TITLE

S-ADENOSYL-L-METHIONINE SYNTHETASE PROMOTER 
AND ITS USE IN EXPRESSION OF TRANSGENIC GENES IN PLANTS 

FIELD OF THE INVENTION 
This invention relates to a plant promoter, in particular, to an S-adenosyl-L-methionine
synthetase (SAMS) promoter and subfragments thereof, and their use in regulating
the expression of at least one heterologous nucleic acid fragment in plants.

BACKGROUND OF THE INVENTION 
Recent advances in plant genetic engineering have opened new doors to engineer 
plants having improved characteristics or traits, such as resistance to plant diseases, insect
resistance, herbicidal resistance, enhanced stability or shelf-life of the ultimate consumer 
product obtained from the plants and improvement of the nutritional quality of the edible 
portions of the plant. Thus, a desired gene (or genes) from a source different than the plant, 
but engineered to impart different or improved characteristics or qualities, can be 
incorporated into the plant's genome. This new gene (or genes) can then be expressed in the
plant cell to exhibit the new trait or characteristic. 

In order to obtain expression of the newly inserted gene in the plant cell, the proper 
regulatory signals must be present and be in the proper location with respect to the gene. 
These regulatory signals include a promoter region, a 5' non-translated leader sequence and a 
3' transcription termination/polyadenylation sequence.

A promoter is a DNA sequence that directs cellular machinery of a plant to produce 
RNA from the contiguous coding sequence downstream (3') of the promoter. The promoter 
region influences the rate, developmental stage, and cell type in which the RNA transcript of 
the gene is made. The RNA transcript is processed to produce messenger RNA (mRNA) 
which serves as a template for translation of the RNA sequence into the amino acid sequence
of the encoded polypeptide. The 5' non-translated leader sequence is a region of the mRNA
upstream of the protein coding region that may play a role in initiation and translation of the
mRNA. The 3' transcription termination/polyadenylation signal is a non-translated region
downstream of the protein coding region that functions in the plant cells to cause termination
of the RNA transcript and the addition of polyadenylate nucleotides to the 3' end of the RNA.

It has been shown that certain promoters are able to direct RNA synthesis at a higher 
rate than others. These are called "strong promoters". Certain other promoters have been 
shown to direct RNA production at higher levels only in particular types of cells or tissues 
and are often referred to as "tissue specific promoters". In this group, many seed storage 
protein genes' promoters have been well characterized and widely used, such as the phaseolin
gene promoter of Phaseolus vulgaris, the helianthinin gene of sunflower, the β-conglycinin
gene of soybean (Chen et al., (1989) Dev. Genet. 10, 112-122), the napin gene promoter of
Brassica napus (Ellerstrom et al., (1996) Plant Mol. Biol. 32, 1019-1027), the oleosin gene
promoters of Brassica and Arabidopsis (Keddie et al., (1994) Plant Mol. Biol. 24, 327-340;
Li, (1997) Texas A&M Ph.D. dissertation, pp. 107-128; Plant et al., (1994) Plant Mol. Biol.
25, 193-205). Another class of tissue specific promoters is described in U.S. Patent
No. 5,589,583, issued to Klee et al. on December 31, 1996; these plant promoters are capable 
of conferring high levels of transcription of chimeric genes in meristematic tissues and/or
rapidly dividing cells. In contrast to tissue-specific promoters, "inducible promoters" direct 
RNA production in response to certain environmental factors, such as heat shock, light, 
hormones, ion concentrations, etc. (Espartero et al., (1994) Plant Mol. Biol. 25, 217-227;
Gomez-Gomez and Carrasco, (1998) Plant Physiol. 117, 397-405; Holtorf et al., (1995) Plant
Mol. Biol. 29, 637-646; McDowell et al., (1996) Plant Physiol. 111, 699-711; Mathur et al.,
(1992) Biochim. Biophys. Acta 1137, 338-348; Mett et al., (1996) Transgenic Res. 5,
105-113; Schoffl et al., (1989) Mol. Gen. Genet. 217, 246-253; Ulmasov et al., (1995) Plant
Physiol. 108, 919-927).

Promoters that are capable of directing RNA production in many or all tissues of a
plant are called "constitutive promoters". The ideal constitutive promoter should be able to
drive gene expression in all cells of the organism throughout its development. Expression of
many so-called constitutive genes, such as actin (McDowell et al., (1996) Plant Physiol. 111,
699-711; Wang et al., (1992) Mol. Cell. Biol. 12, 3399-3406), and ubiquitin (Callis et al.,
(1990) J. Biol. Chem. 265, 12486-12493; Rollfinke et al., (1998) Gene 211, 267-276) varies
depending on the tissue types and developmental stages of the plant. The most widely used
constitutive promoter, the cauliflower mosaic virus 35S promoter, also shows variations in
activity in different plants and in different tissues of the same plant (Atanassova et al., (1998)
Plant Mol. Biol. 37, 275-285; Battraw and Hall, (1990) Plant Mol. Biol. 15, 527-538; Holtorf
et al., (1995) Plant Mol. Biol. 29, 637-646; Jefferson et al., (1987) EMBO J. 6, 3901-3907;
Wilmink et al., (1995) Plant Mol. Biol. 28, 949-955). The cauliflower mosaic virus 35S
promoter is also described in U.S. Patent No. 5,106,739. The tissue-specific expression and
synergistic interactions of sub-domains of the promoter of cauliflower mosaic virus are
discussed in U.S. Patent No. 5,097,025, which issued to Benfey et al. on March 17, 1992. A
Brassica promoter (hsp80) that provides for constitutive expression of heterologous genes in
a wide range of tissues and organs is discussed in U.S. Patent No. 5,612,472, which issued to
Wilson et al. on March 18, 1997.

Since the patterns of expression of a chimeric gene (or genes) introduced into a plant 
are controlled using promoters, there is an ongoing interest in the isolation and identification 
of novel promoters which are capable of controlling expression of a chimeric gene (or genes).

SUMMARY OF THE INVENTION

This invention concerns an isolated nucleic acid fragment comprising a promoter 
wherein said promoter consists essentially of the nucleotide sequence set forth in SEQ ID 
NOs:6, 14, 15, or 16 or said promoter consists essentially of a fragment or subfragment that is
substantially similar and functionally equivalent to the nucleotide sequence set forth in SEQ
ID NOs:6, 14, 15, or 16.

In a second embodiment, this invention concerns a chimeric gene comprising at least 
one heterologous nucleic acid fragment operably linked to the promoter of the invention. 

In a third embodiment, this invention concerns plants containing this chimeric gene
and seeds obtained from such plants.

In a fourth embodiment, this invention concerns a method of increasing or decreasing 
the expression of at least one heterologous nucleic acid fragment in a plant cell which 
comprises: 

(a) transforming a plant cell with the chimeric gene described above;

(b) growing fertile mature plants from the transformed plant cell of step (a); 

(c) selecting plants containing the transformed plant cell wherein the expression 
of the heterologous nucleic acid fragment is increased or decreased. 

In a fifth embodiment, this invention concerns an isolated nucleic acid fragment 
comprising a constitutive plant SAMS promoter.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCES 
The invention can be more fully understood from the following detailed description, 
the drawings and the Sequence Descriptions that form a part of this application. The 
Sequence Descriptions contain the three letter codes for amino acids as defined in 37 C.F.R. 
§§ 1.821-1.825, which are incorporated herein by reference.

SEQ ID NO:1 is the nucleotide sequence comprising the entire cDNA insert in clone
s2.12b06 which encodes a soybean S-adenosyl-L-methionine synthetase protein. 

SEQ ID NO:2 is the nucleotide sequence comprising a soybean S-adenosyl-L- 
methionine synthetase genomic DNA fragment. 

SEQ ID NO:3 is the nucleotide sequence of a portion of the cDNA insert in clone
srrlc.pk002.b21 encoding a portion of a soybean S-adenosyl-L-methionine synthetase 
protein. 

SEQ ID NO:4 is a 32 base oligonucleotide primer, designated sam-5, used to amplify 
the soybean S-adenosyl-L-methionine synthetase promoter region via PCR. 

SEQ ID NO:5 is a 24 base oligonucleotide primer, designated sam-6, used to amplify
the soybean S-adenosyl-L-methionine synthetase promoter region via PCR. 

SEQ ID NO:6 is the nucleotide sequence comprising a soybean S-adenosyl-L- 
methionine synthetase promoter fragment produced via PCR using primers sam-5 (SEQ ID 
NO:4) and sam-6 (SEQ ID NO:5). 

SEQ ID NO:7 is a 22 base oligonucleotide primer, designated sam-9, used to amplify
the soybean S-adenosyl-L-methionine synthetase promoter region via PCR.

SEQ ID NO:8 is a 19 base oligonucleotide primer, designated atps-9, used to amplify
a chimeric gene comprising a SAMS promoter fragment and a portion of the ATP sulfurylase
(ATPS) gene via PCR.

SEQ ID NO:9 is a 21 base oligonucleotide primer, designated cgs-8, used to amplify a
chimeric gene comprising a SAMS promoter and a portion of the cystathionine-γ-synthase 1
(CGS1) gene via PCR.

SEQ ID NO:10 is a 20 base oligonucleotide antisense primer, designated atps-4, used
to amplify the ATP sulfurylase transcript via RT-PCR. 

SEQ ID NO:11 is a 21 base oligonucleotide antisense primer, designated cgs-10, used
to amplify the cystathionine-γ-synthase 1 transcript via RT-PCR.

SEQ ID NO:12 is a 20 base oligonucleotide primer, designated atps-3, used to amplify
an ATP sulfurylase cDNA via PCR. 

SEQ ID NO:13 is a 23 base oligonucleotide primer, designated cgs-9, used to amplify
a cystathionine-γ-synthase 1 cDNA via PCR.

SEQ ID NO:14 is a 2165 nucleotide sequence comprising a soybean S-adenosyl-L-methionine
synthetase genomic DNA fragment which starts at the 5' end of SEQ ID NO:2,
and ends at the ATG translation start codon of the S-adenosyl-L-methionine synthetase.

SEQ ID NO:15 is a 1574 nucleotide sequence comprising a DNA fragment which
starts at the 5' end of SEQ ID NO:2, and ends at the ATG translation start codon of the
S-adenosyl-L-methionine synthetase, and wherein a 591 nucleotide intron sequence has been
removed.

SEQ ID NO:16 is a 719 nucleotide sequence comprising a DNA fragment which
starts at nucleotide 4 of SEQ ID NO:6, and ends at the ATG translation start codon of the
S-adenosyl-L-methionine synthetase, and wherein a 591 nucleotide intron sequence has been
removed.

SEQ ID NO: 17 is a 6975 nucleotide sequence comprising plasmid pMH40A. 

SEQ ID NO:18 is a 3985 nucleotide sequence comprising a SAMS promoter::GUS::3'
Nos DNA fragment present in plasmid pZSL11.

SEQ ID NO:19 is a 3684 nucleotide sequence comprising a SAMS
promoter::ATPS::3' Nos DNA fragment.

SEQ ID NO:20 is a 3963 nucleotide sequence comprising a SAMS
promoter::CGS1::3' Nos DNA fragment.

Figures 1A and 1B depict Southern hybridization analyses of SAMS genes. Soybean
genomic DNA was digested with BamHI, EcoRI, HindIII, KpnI, and SacI, and then the blot
was hybridized with a full length SAMS cDNA (SEQ ID NO:1) probe in Figure 1A or with a
SAMS promoter fragment (SEQ ID NO:6) probe in Figure 1B.

Figure 2 depicts a SAMS genomic DNA sequence (SEQ ID NO:2) and the alignment
of the overlapping region with SAMS cDNA sequence (SEQ ID NO:1). The 2336 bp SAMS
genomic DNA sequence has a 191 bp region aligned with the 5' end sequence of the SAMS
cDNA with six mismatches. The region used to make the SAMS promoter by adding the
NcoI site at its 3' end is underlined. The translation start codon is in bold.

Figure 3 depicts the structure of the SAMS::GUS expression cassette. The SAMS
promoter was cloned into pMH40A to replace its 35S promoter. The structure of the resulting
SAMS::GUS construct was generated by Vector NTI™ software (InforMax, Inc., North
Bethesda, MD).

Figure 4 depicts a histochemical GUS expression analysis of transgenic Arabidopsis 
plants harboring the SAMS::GUS expression cassette. Arabidopsis tissues were incubated at 
37°C with X-Gluc overnight and dehydrated with ethanol. (A) Flower buds; (B) Leaf;
(C) Inflorescence stem and a cauline leaf; (D, E, F) developing siliques; (G) Developing 
seeds and embryos. All of the seeds were derived from GUS-positive siliques. Genetic 
segregation of the GUS gene was demonstrated by the blue funiculus of the white seed in the 
right upper corner. 

Figure 5 depicts a fluorometric GUS expression assay of transgenic Arabidopsis
plants harboring the SAMS::GUS expression cassette. Triplicate samples of flowers, leaves,
stems, silique coats, young seeds, medium seeds, old seeds, and dry seeds collected from
SAMS::GUS transgenic Arabidopsis plants were assayed for GUS activity. The graph was
generated by Microsoft Excel and the standard deviation is indicated by the upper part of
each column.

Figure 6 depicts a histochemical GUS transient expression analysis of SAMS
promoter in corn. The pZSL11 (SAMS::GUS) or the pMH40A (35S::GUS) plasmid DNA
was delivered into corn callus (A, C) or leaf discs (B, D), and the GUS activity was detected
by incubation with X-Gluc overnight at 37°C. (A, B) Transformed with pZSL11 DNA; (C,
D) Transformed with pMH40A DNA.

Figures 7(A) and 7(B) depict the presence and expression of transgenic soybean
ATPS and CGS1 genes controlled by the SAMS promoter in transgenic Arabidopsis plants.
Figure 7(A) is a PCR analysis. Genomic DNA of ten transgenic Arabidopsis plants (1 to 10),
wild type Arabidopsis (a), wild type soybean (s), and plasmid DNA of SAMS::CGS1 or
SAMS::ATPS in binary vectors (p) were used as templates in PCR with gene-specific
primers. PCR of ten SAMS::CGS1 transgenic plants with primer sam-9, which is specific to
SAMS promoter, and primer cgs-8, which is specific to soybean CGS1 (upper). PCR of ten
SAMS::ATPS transgenic plants with primer sam-9, which is specific to SAMS promoter, and
primer atps-1, which is specific to soybean ATPS gene (lower). Figure 7(B) is an RT-PCR
analysis. Total leaf RNA of ten transgenic Arabidopsis plants (1 to 10), wild type
Arabidopsis (a), and wild type soybean (s) were used as templates in RT-PCR with
gene-specific primers. First strand cDNA was synthesized from a gene-specific antisense primer
with reverse transcriptase, and then the first strand cDNA was amplified by PCR with both
sense and antisense primers. RT-PCR of ten SAMS::CGS1 transgenic plants with primers
cgs-9 (sense) and cgs-10 (antisense), specific to soybean CGS1 gene (upper). RT-PCR of ten
SAMS::ATPS transgenic plants with primers atps-3 (sense) and atps-4 (antisense), specific
to soybean ATPS gene (lower).

Figure 8 depicts induction of SAMS promoter activity by methionine. Seeds of ten
transgenic Arabidopsis lines transformed with SAMS::GUS construct were germinated on
filter papers soaked with H2O, 1x Murashige and Skoog salt, 0.01 mM, and 1 mM
methionine. Ten-day-old seedlings were harvested and assayed for GUS activity. The solid
bar and hollow bar indicate, respectively, the average and the standard deviation of three
samples for each treatment.

Figure 9 depicts a northern hybridization. Soybean total RNAs from leaves, roots, 
stems, young seeds, medium seeds, old seeds, and pod coats (L, R, S, Y, M, O, and P) were 
used to make the RNA blot which was hybridized with a full length SAMS cDNA (SEQ ID 
NO:1) probe.

DETAILED DESCRIPTION OF THE INVENTION

In the context of this disclosure, a number of terms shall be utilized. 
As used herein, an "isolated nucleic acid fragment" is a polymer of ribonucleotides 
(RNA) or deoxyribonucleotides (DNA) that is single- or double-stranded, optionally 
containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid
fragment in the form of DNA may be comprised of one or more segments of cDNA, genomic
DNA or synthetic DNA. 

The terms "subfragment that is functionally equivalent" and "functionally equivalent 
subfragment" are used interchangeably herein. These terms refer to a portion or 
subsequence of an isolated nucleic acid fragment in which the ability to alter gene
expression or produce a certain phenotype is retained whether or not the fragment or
subfragment encodes an active enzyme. For example, the fragment or subfragment can be
used in the design of chimeric genes to produce the desired phenotype in a transformed
plant. Chimeric genes can be designed for use in co-suppression or antisense by linking a
nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in
the appropriate orientation relative to a plant promoter sequence.

The terms "substantially similar" and "corresponding substantially" as used herein 
refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not
affect the ability of the nucleic acid fragment to mediate gene expression or produce a
certain phenotype. These terms also refer to modifications of the nucleic acid fragments of
the instant invention such as deletion or insertion of one or more nucleotides that do not
substantially alter the functional properties of the resulting nucleic acid fragment relative to
the initial, unmodified fragment. It is therefore understood, as those skilled in the art will 
appreciate, that the invention encompasses more than the specific exemplary sequences. 



Moreover, the skilled artisan recognizes that substantially similar nucleic acid
sequences encompassed by this invention are also defined by their ability to hybridize, under
moderately stringent conditions (for example, 0.5X SSC, 0.1% SDS, 60°C) with the
sequences exemplified herein, or to any portion of the nucleotide sequences reported herein
and which are functionally equivalent to the promoter of the invention. Preferred
substantially similar nucleic acid sequences encompassed by this invention are those
sequences that are 80% identical to the nucleic acid fragments reported herein or which are
80% identical to any portion of the nucleotide sequences reported herein. More preferred
are nucleic acid fragments which are 90% identical to the nucleic acid sequences reported
herein, or which are 90% identical to any portion of the nucleotide sequences reported
herein. Most preferred are nucleic acid fragments which are 95% identical to the nucleic
acid sequences reported herein, or which are 95% identical to any portion of the nucleotide
sequences reported herein. Sequence alignments and percent similarity calculations may be
determined using the Megalign program of the LASERGENE bioinformatics computing
suite (DNASTAR Inc., Madison, WI). Multiple alignment of the sequences is performed
using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS 5:151-153) with
the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default
parameters for pairwise alignments and calculation of percent identity of protein sequences
using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and
DIAGONALS SAVED=5. For nucleic acids these parameters are GAP PENALTY=10,
GAP LENGTH PENALTY=10, KTUPLE=2, GAP PENALTY=5, WINDOW=4 and
DIAGONALS SAVED=4. A "substantial portion" of an amino acid or nucleotide sequence
comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of
a gene to afford putative identification of that polypeptide or gene, either by manual
evaluation of the sequence by one skilled in the art, or by computer-automated sequence
comparison and identification using algorithms such as BLAST (Altschul, S. F., et al.,
(1993) J. Mol. Biol. 215:403-410) and Gapped BLAST (Altschul, S. F. et al., (1997) Nucleic
Acids Res. 25:3389-3402); see also www.ncbi.nlm.nih.gov/BLAST/.
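The 80%, 90% and 95% identity levels recited above can be illustrated with a minimal Python sketch. It is not the Megalign or Clustal calculation; it assumes that two sequences have already been aligned, that gapped columns are ignored, and that each stated percentage is read as a lower bound. The function names percent_identity and preference_tier are hypothetical and do not appear in this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text.

    Assumption: each recited percentage is treated as a minimum.
    """
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below the stated thresholds"


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, preference_tier(identity))
```

Other conventions for the denominator (for example, counting gapped columns as mismatches) give different values, so any comparison against the recited levels should state which convention is used.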

"Gene" refers to a nucleic acid fragment that expresses a specific protein, including 

regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding
sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its
own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene,
comprising regulatory and coding sequences that are not found together in nature.
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that
are derived from different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found in nature.
"Endogenous gene" refers to a native gene in its natural location in the genome of an
organism. A "foreign" gene refers to a gene not normally found in the host organism, but that
is introduced into the host organism by gene transfer. Foreign genes can comprise native
genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that 
has been introduced into the genome by a transformation procedure. 

A "heterologous nucleic acid fragment" refers to a nucleic acid fragment comprising a
nucleic acid sequence that is different from the nucleic acid sequence comprising the plant
promoter of the invention.

"Coding sequence" refers to a DNA sequence that codes for a specific amino acid
sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5'
non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or translation of the
associated coding sequence. Regulatory sequences may include, but are not limited to,
promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" refers to a DNA sequence capable of controlling the expression of a
coding sequence or functional RNA. The promoter sequence consists of proximal and more
distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an
"enhancer" is a DNA sequence which can stimulate promoter activity and may be an innate
element of the promoter or a heterologous element inserted to enhance the level or
tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or
be composed of different elements derived from different promoters found in nature, or even
comprise synthetic DNA segments. It is understood by those skilled in the art that different
promoters may direct the expression of a gene in different tissues or cell types, or at different
stages of development, or in response to different environmental conditions. Promoters
which cause a gene to be expressed in most cell types at most times are commonly referred to
as "constitutive promoters". New promoters of various types useful in plant cells are
constantly being discovered; numerous examples may be found in the compilation by
Okamuro and Goldberg (1989, Biochemistry of Plants 15:1-82). It is further recognized that
since in most cases the exact boundaries of regulatory sequences have not been completely
defined, DNA fragments of some variation may have identical promoter activity. An "intron"
is an intervening sequence in a gene that is transcribed into RNA but is then excised in the
process of generating the mature mRNA. The term is also used for the excised RNA
sequences. An "exon" is a portion of the sequence of a gene that is transcribed and is found
in the mature messenger RNA derived from the gene, but is not necessarily a part of the
sequence that encodes the final gene product.
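The excision of an intron, as defined above, can be sketched in Python as follows. The sketch also checks the length relationship stated in the Sequence Descriptions, where removing a 591 nucleotide intron from the 2165 nucleotide fragment of SEQ ID NO:14 gives the 1574 nucleotide fragment of SEQ ID NO:15. The function name remove_intron, the 0-based coordinate convention, the placeholder sequence and the intron position used in the usage example are all assumptions made for illustration; the actual sequences and intron coordinates are not reproduced here.

```python
def remove_intron(sequence: str, intron_start: int, intron_length: int) -> str:
    """Return the sequence with one intron excised.

    intron_start is a 0-based offset (an assumption of this sketch).
    """
    if intron_start < 0 or intron_length < 0:
        raise ValueError("start and length must be non-negative")
    if intron_start + intron_length > len(sequence):
        raise ValueError("intron extends past the end of the sequence")
    return sequence[:intron_start] + sequence[intron_start + intron_length:]


if __name__ == "__main__":
    # Toy example: excise the 7-nucleotide stretch "GTCCCAG".
    print(remove_intron("AAAGTCCCAGTTT", 3, 7))

    # Length check only: a placeholder of 2165 nucleotides with a
    # hypothetical intron position; only the lengths come from the text.
    placeholder = "N" * 2165
    print(len(remove_intron(placeholder, 1000, 591)))
```

Only the lengths 2165, 591 and 1574 are taken from this disclosure; the offset 1000 is arbitrary and does not affect the resulting length.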

The "translation leader sequence" refers to a DNA sequence located between the
promoter sequence of a gene and the coding sequence. The translation leader sequence is
present in the fully processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary transcript to mRNA, mRNA
stability or translation efficiency. Examples of translation leader sequences have been
described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

The "3' non-coding sequences" refer to DNA sequences located downstream of a
coding sequence and include polyadenylation recognition sequences and other sequences
encoding regulatory signals capable of affecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid
tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

"RNA transcript" refers to a product resulting from RNA polymerase-catalyzed
transcription of a DNA sequence. When an RNA transcript is a perfect complementary copy
of a DNA sequence, it is referred to as a primary transcript or it may be a RNA sequence
derived from posttranscriptional processing of a primary transcript and is referred to as a
mature RNA. "Messenger RNA" ("mRNA") refers to RNA that is without introns and that
can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to
and synthesized from an mRNA template using the enzyme reverse transcriptase. The
cDNA can be single-stranded or converted into the double-stranded form by using the Klenow
fragment of DNA polymerase I. "Sense" RNA refers to RNA transcript that includes
mRNA and so can be translated into protein within a cell or in vitro. "Antisense RNA"
refers to a RNA transcript that is complementary to all or part of a target primary transcript
or mRNA and that blocks expression or transcript accumulation of a target gene (U.S.
Patent No. 5,107,065). The complementarity of an antisense RNA may be with any part of
the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence,
introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme
RNA, or other RNA that may not be translated but yet has an effect on cellular processes.
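The complementarity that defines an antisense RNA can be illustrated with a short Python sketch that returns the reverse complement of a sense RNA, written 5' to 3'. The function name antisense_rna and the example sequence are hypothetical, and only the four standard RNA bases are handled; the sketch illustrates base pairing and says nothing about whether a given antisense transcript would suppress expression in a plant.

```python
# Watson-Crick pairing for the four standard RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_rna: str) -> str:
    """Return the reverse complement of a sense RNA, written 5' to 3'."""
    rna = sense_rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected bases: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return rna.translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "AUGGCU"  # hypothetical fragment beginning with a start codon
    print(antisense_rna(sense))
```

Applying the function twice returns the original sequence, which is a quick sanity check that the complement table and the reversal are consistent.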


The term "operably linked" refers to the association of nucleic acid sequences on a 
single nucleic acid fragment so that the function of one is affected by the other. For example,
a promoter is operably linked with a coding sequence when it is capable of affecting the
expression of that coding sequence (i.e., that the coding sequence is under the transcriptional
control of the promoter). Coding sequences can be operably linked to regulatory sequences 
in sense or antisense orientation. 



The term "expression", as used herein, refers to the production of a functional end- 
product. Expression or overexpression of a gene involves transcription of the gene and 
translation of the mRNA into a precursor or mature protein. "Antisense inhibition" refers to 
the production of antisense RNA transcripts capable of suppressing the expression of the 
target protein. "Overexpression" refers to the production of a gene product in transgenic
organisms that exceeds levels of production in normal or non-transformed organisms. 
"Co-suppression" refers to the production of sense RNA transcripts capable of suppressing 
the expression or transcript accumulation of identical or substantially similar foreign or 
endogenous genes (U.S. Patent No. 5,231,020). The mechanism of co-suppression may be
at the DNA level (such as DNA methylation), at the transcriptional level, or at the
post-transcriptional level.

"Altered expression" refers to the production of gene product(s) in transgenic 
organisms in amounts or proportions that differ significantly from the amount of the gene 
product(s) produced by the corresponding wild-type organisms. 

"Transformation" refers to the transfer of a nucleic acid fragment into the genome of a
host organism, resulting in genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic" organisms. The preferred 
method of corn cell transformation is use of particle-accelerated or "gene gun" transformation 
technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Patent No. 4,945,050). 

Standard recombinant DNA and molecular cloning techniques used herein are well
known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T.,
Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, 1989 (hereinafter "Sambrook et al., 1989") or Ausubel, F. M., Brent, R.,
Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A. and Struhl, K. (eds.), Current
Protocols in Molecular Biology, John Wiley and Sons, New York, 1990 (hereinafter
"Ausubel et al., 1990").

"PCR" or "Polymerase Chain Reaction" is a technique for the synthesis of large 
quantities of specific DNA segments and consists of a series of repetitive cycles (Perkin Elmer
Cetus Instruments, Norwalk, CT). Typically, the double stranded DNA is heat denatured,
the two primers complementary to the 3' boundaries of the target segment are annealed at
low temperature and then extended at an intermediate temperature. One set of these three 
consecutive steps comprises a cycle. 
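The effect of repeating the denature, anneal and extend cycle can be sketched with an idealized model in which each cycle multiplies the number of target copies by (1 + efficiency). This model is a common textbook simplification and is not taken from this disclosure; the function name copies_after_cycles, the efficiency parameter and the cycle counts in the usage example are assumptions chosen for illustration.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized PCR yield: each cycle multiplies copies by (1 + efficiency).

    efficiency = 1.0 models perfect doubling per cycle (an assumption);
    real reactions plateau and are not captured by this sketch.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (0, 10, 20, 30):
        print(n, copies_after_cycles(1, n))
```

With perfect doubling, 30 cycles starting from a single template give 2 to the power 30 copies, which is why a modest number of cycles suffices to synthesize large quantities of a specific segment.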

An "expression construct" is a plasmid vector or a subfragment thereof comprising
the instant chimeric gene. The choice of plasmid vector is dependent upon the method that
will be used to transform host plants. The skilled artisan is well aware of the genetic
elements that must be present on the plasmid vector in order to successfully transform, select
and propagate host cells containing the chimeric gene. The skilled artisan will also
recognize that different independent transformation events will result in different levels and
patterns of expression (Jones et al., (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989)
Mol. Gen. Genetics 218:78-86), and thus that multiple events must be screened in order to
obtain lines displaying the desired expression level and pattern. Such screening may be
accomplished by Southern analysis of DNA, Northern analysis of mRNA expression,
Western analysis of protein expression, or phenotypic analysis.

Although the SAMS enzyme is present in most plant cell types, no SAMS promoter 
capable of driving gene expression in most or all plant cell types has been described. 
Previous studies indicated that plants contain multiple SAMS genes which are differentially 
expressed in response to various stresses (Schroder et al. (1997) Plant Mol. Biol.
33:211-222). A SAMS promoter that is preferentially active in a particular tissue type, i.e.,
vascular (Peleman et al., (1989) Plant Cell 1, 81-93; Mijnsbrugge et al., (1996) Plant Cell
Physiol. 37, 1108-1115), was also known. However, it was not possible to predict, before the
studies reported herein, whether any SAMS gene was controlled by a constitutive promoter.
It is demonstrated herein that constitutive SAMS promoters do, in fact, exist in plants, and
that such promoters can be readily isolated and used by one skilled in the art.

This invention concerns an isolated nucleic acid fragment comprising a constitutive 
plant SAMS promoter. This invention also concerns an isolated nucleic acid fragment 
comprising a promoter wherein said promoter consists essentially of the nucleotide sequence 
set forth in SEQ ID NOs:6, 14, 15 or 16 or said promoter consists essentially of a fragment or
subfragment that is substantially similar and functionally equivalent to the nucleotide
sequence set forth in SEQ ID NOs:6, 14, 15 or 16. A nucleic acid fragment that is
functionally equivalent to the instant SAMS promoter is any nucleic acid fragment that is
capable of controlling the expression of a coding sequence or functional RNA in a similar
manner to the SAMS promoter. The expression patterns of the SAMS promoter are defined
in the following paragraphs.

Northern-blot hybridization experiments indicated that SAMS gene transcripts are 
present in a variety of soybean tissues and that the abundance of SAMS gene transcripts does 
not differ greatly from tissue to tissue (Figure 9 and Example 3). Strong expression of the 
SAMS gene was also inferred by the high frequency of occurrences of cDNA sequences with
homology to SAMS (ESTs) in a soybean cDNA sequence database created by sequencing
random cDNAs from libraries prepared from many different soybean tissues. ESTs encoding
SAMS can be easily identified by conducting BLAST (Basic Local Alignment Search Tool;
Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the
BLAST "nr" database, e.g., SAMS from Oryza sativa (EMBL Accession No. Z26867) or
SEQ ID NO:1 provided herein. SAMS homologs were among the most abundant classes of
cDNAs found in the soybean libraries. This indicated that SAMS was a highly expressed
gene in most soybean cell types. The data obtained from sequencing many SAMS ESTs also
indicated that there were several SAMS isoforms encoded by the soybean genome.

A soybean cDNA clone designated s2.12b06 was found to encode a protein which is
very similar to the protein encoded by the cDNA to Oryza sativa SAMS (pLog value for this
match was 61.59). The soybean cDNA clone designated s2.12b06 was completely sequenced
(SEQ ID NO:1) and found to contain an open reading frame which encodes a full length
SAMS polypeptide. Southern hybridization analysis of soybean genomic DNA with this full
length SAMS cDNA as a probe suggested that there are approximately four related SAMS
genes in the soybean genome (Figure 1A), which is consistent with the EST sequencing data.

The soybean SAMS cDNA clone was used to isolate a soybean genomic DNA
fragment containing more than 2000 nucleotides upstream (5') of the SAMS protein coding
sequence by hybridization of a soybean genomic DNA library to the SAMS cDNA fragment
probe. Southern hybridization analysis of soybean genomic DNA using a 1314 base pair
DNA fragment from upstream of the SAMS protein coding sequence as a probe indicated that
this fragment is unique in the soybean genome (Figure 1B).

The promoter activity of the soybean genomic DNA fragment upstream of the SAMS 
protein coding sequence was assessed by linking the fragment to a reporter gene, the E. coli 
β-glucuronidase gene (GUS) (Jefferson (1987) Plant Mol. Biol. Rep. 5:387-405),
transforming the SAMS promoter::GUS expression cassette into Arabidopsis, and analyzing
GUS expression in various cell types of the transgenic plants. GUS expression was detected
in all parts of the transgenic plants that were analyzed. These results indicated that the
nucleic acid fragment contained a constitutive promoter. Since SAMS catalyzes the reaction
to synthesize S-adenosyl-L-methionine from methionine and ATP, free methionine levels
might regulate SAMS promoter activity. To see if the SAMS promoter is regulated by
external methionine, the SAMS::GUS transgenic Arabidopsis seeds were germinated in the
presence or absence of methionine. Ten day old seedlings were analyzed for GUS activity
according to the protocol described in Example 5. Ten independent transgenic lines were
tested and all of them responded similarly. GUS activity was more than two-fold higher in
seedlings germinated in the presence of methionine (Figure 8). The increased SAMS
promoter activity in the presence of methionine may be particularly useful for efforts to
increase methionine biosynthesis via overexpression of enzymes in the methionine 
biosynthetic pathway or the sulfate assimilation pathway. It is clear from the disclosure set 
forth herein that one of ordinary skill in the art could readily isolate a constitutive plant 
SAMS promoter from any plant by performing the following procedure: 

1) obtaining a SAMS cDNA from a desired plant by any of a variety of methods
well known to those skilled in the art including, but not limited to, (a) random sequencing of
ESTs from a cDNA library and characterizing the ESTs via a BLAST search as described
above; or (b) hybridizing a cDNA library to a known plant SAMS cDNA; or (c) PCR
amplification using oligonucleotide primers designed from known SAMS cDNAs;

2) obtaining a genomic DNA fragment that includes approximately 500 to 3000
nucleotides from the region 5' to a SAMS protein coding sequence, which contains a SAMS
promoter, by hybridization of a genomic DNA library to a SAMS cDNA fragment probe;

3) operably linking the nucleic acid fragment containing the region upstream
(5') of the SAMS protein coding sequence to a suitable reporter gene; there are a variety of
reporter genes that are well known to those skilled in the art, including the bacterial GUS
gene, the firefly luciferase gene, and the green fluorescent protein gene; any gene for which
an easy and reliable assay is available can serve as the reporter gene;

4) transforming a chimeric SAMS promoter::reporter gene expression cassette
into an appropriate plant for expression of the promoter. There are a variety of appropriate
plants which can be used as a host for transformation that are well known to those skilled in
the art, including the dicots Arabidopsis, tobacco, soybean, oilseed rape, peanut, sunflower,
safflower, cotton, tomato, potato and cocoa, and the monocots corn, wheat, rice, barley and
palm. The terms "oilseed rape" and "oilseed Brassica" are used interchangeably herein;

5) testing for expression of a SAMS promoter in various cell types of transgenic
plants, e.g., leaves, roots, flowers, seeds, transformed with the chimeric SAMS
promoter::reporter gene expression cassette by assaying for expression of the reporter gene
product. A constitutive SAMS promoter will produce high level expression of the reporter in
all, or nearly all, of the plant tissues tested.
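
Step 2 of the procedure above amounts to taking the region immediately 5' of the protein coding sequence. The following is a minimal sketch of that idea only, assuming the genomic sequence is available as a plain string and that the 0-based index of the ATG start codon is already known. The function name `upstream_region` and the toy sequence are hypothetical and are not part of the disclosure.

```python
def upstream_region(genomic: str, atg_index: int, length: int = 2000) -> str:
    """Return up to `length` nucleotides immediately 5' of the start codon.

    `atg_index` is the 0-based position of the A of the ATG start codon.
    The 500 to 3000 nucleotide range named in step 2 is a guideline; the
    default of 2000 is an arbitrary illustrative choice.
    """
    if genomic[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG start codon")
    start = max(0, atg_index - length)  # clamp at the 5' end of the clone
    return genomic[start:atg_index]


# Toy example (hypothetical sequence): 30 nt of upstream DNA, then ATG.
toy = "GGCCTTAAGG" * 3 + "ATGGCTTAA"
region = upstream_region(toy, atg_index=30, length=12)
print(region)       # GGGGCCTTAAGG
print(len(region))  # 12
```

If the requested length exceeds the available upstream sequence, the sketch simply returns everything 5' of the start codon.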

In another aspect, this invention concerns a chimeric gene comprising at least one 
heterologous nucleic acid fragment operably linked to the promoter of the present invention. 
Chimeric genes can be constructed by operably linking the nucleic acid fragment of the
invention, i.e., the SAMS promoter or a fragment or a subfragment that is substantially
similar and functionally equivalent to any portion of the nucleotide sequence set forth in SEQ
ID NOs:6, 14, 15 or 16, to a heterologous nucleic acid fragment. Any heterologous nucleic
acid fragment can be used to practice the invention. The selection will depend upon the
desired application or phenotype to be achieved. The various nucleic acid sequences can be
manipulated so as to provide for the nucleic acid sequences in the proper orientation.

Plasmid vectors comprising the instant chimeric genes can then be constructed. The 
choice of plasmid vector is dependent upon the method that will be used to transform host 
cells. The skilled artisan is well aware of the genetic elements that must be present on the 
plasmid vector in order to successfully transform, select and propagate host cells containing
the chimeric gene.

The plasmid vectors or chimeric genes can be used to transform plant cells. 
Transformation techniques are well known to those skilled in the art as discussed above. A
preferred method of plant cell transformation is the use of particle-accelerated or "gene gun"
transformation technology (Klein et al., (1987) Nature (London) 327:70-73; U.S. Patent
No. 4,945,050). The chimeric gene will normally be joined to a marker for selection in plant
cells. The marker may be resistance to a biocide, particularly an antibiotic, such as
kanamycin, G418, bleomycin, hygromycin, chloramphenicol, or the like. The particular
marker employed will be one which will allow for selection of transformed cells as compared
to cells lacking the heterologous nucleic acid sequence which has been introduced. Examples
of plant cells which can be transformed using plant transformation techniques include, but are
not limited to, monocot and dicot plant cells such as soybean, oilseed Brassica species, corn,
peanut, rice, wheat, sunflower, safflower, cotton, cocoa, tobacco, tomato, potato, barley, palm,
Arabidopsis and the like.

In addition to the bacterial GUS gene, two soybean genes, ATP sulfurylase (ATPS)
and cystathionine-γ-synthase 1 (CGS1), were also successfully expressed by this promoter in
transgenic Arabidopsis, as depicted in Figure 7. This further validates the application of the
SAMS promoter of the invention in plant genetic engineering practice.

The skilled artisan will also recognize that different independent transformation
events will result in different levels and patterns of expression of the chimeric genes (Jones et
al., (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86).
Thus, multiple events must be screened in order to obtain lines displaying the desired
expression level and pattern. Such screening may be accomplished by northern analysis of
mRNA expression, western analysis of protein expression, or phenotypic analysis. Also of
interest are seeds obtained from transformed plants displaying the desired expression profile.

The level of activity of the SAMS promoter is comparable to that of many known 
strong promoters, such as the CaMV 35S promoter (Atanassova et al., (1998) Plant Mol.
Biol. 37:275-285; Battraw and Hall, (1990) Plant Mol. Biol. 15:527-538; Holtorf et al.,
(1995) Plant Mol. Biol. 22:637-646; Jefferson et al., (1987) EMBO J. 6:3901-3907; Wilmink
et al., (1995) Plant Mol. Biol. 25:949-955), the Arabidopsis oleosin promoters (Plant et al.,
(1994) Plant Mol. Biol. 25:193-205; Li, (1997) Texas A&M University Ph.D. dissertation,
pp. 107-128), the Arabidopsis ubiquitin extension protein promoters (Callis et al., 1990), a
tomato ubiquitin gene promoter (Rollfinke et al., 1998), a soybean heat shock protein
promoter (Schoffl et al., 1989), and a maize H3 histone gene promoter (Atanassova et al.,
1998).

Expression of the chimeric genes in most plant cells makes the SAMS promoter of the
instant invention especially useful when constitutive expression of a target heterologous
nucleic acid fragment is required. Examples of suitable target heterologous nucleic acid
fragments include, but are not limited to, a herbicide-resistance or pathogen-resistance
nucleic acid fragment. Another useful feature of the constitutive plant SAMS promoter is its
expression profile in developing seeds. The SAMS promoter of the invention is most active
in developing seeds at early stages, and its activity gradually declines at later stages. Such
activity is indicated by the GUS activity detected in seeds of transgenic Arabidopsis plants
containing a SAMS::GUS expression cassette as shown in Figures 4 and 5. The expression
profile of the claimed SAMS promoter is different from that of many seed-specific promoters,
e.g., seed storage protein promoters, which often provide highest activity in later stages of
development (Chen et al., (1989) Dev. Genet. 10:112-122; Ellerstrom et al., (1996) Plant Mol.
Biol. 32:1019-1027; Keddie et al., (1994) Plant Mol. Biol. 24:327-340; Plant et al., (1994) Plant
Mol. Biol. 25:193-205; Li, (1997) Texas A&M University Ph.D. dissertation, pp. 107-128).
Thus, the SAMS promoter will be a very attractive candidate when overexpression of a gene
in embryos is desired at an early developing stage. For example, it may be desirable to
overexpress a gene regulating early embryo development or a gene involved in
metabolism prior to seed maturation.

One general application of the SAMS promoter of the invention is to construct 
chimeric genes that can be used in the selection of transgenic cell lines in plant 
transformation. Currently, many of the selectable marker genes for plant transformation are
under the control of the cauliflower mosaic virus 35S promoter. Since the SAMS promoter
of the invention is active in seedlings and callus, the appropriate selection phase for
transgenic plants or cell lines, this promoter may be used as an alternative to the 35S
promoter to drive the expression of selectable marker genes.

Another general application of the SAMS promoter of the invention is to construct
chimeric genes that can be used to reduce expression of at least one heterologous nucleic acid
fragment in a plant cell. To accomplish this, a chimeric gene designed for cosuppression of a
heterologous nucleic acid fragment can be constructed by linking the fragment to the SAMS
promoter of the present invention. (See U.S. Patent No. 5,231,020 for methodology to block
plant gene expression via cosuppression.) Alternatively, a chimeric gene designed to express
antisense RNA for a heterologous nucleic acid fragment can be constructed by linking the
fragment in reverse orientation to the SAMS promoter of the present invention. (See U.S.
Patent No. 5,107,065 for methodology to block plant gene expression via antisense RNA.)
Either the cosuppression or antisense chimeric gene can be introduced into plants via
transformation. Transformants wherein expression of the heterologous nucleic acid fragment
is decreased or eliminated are then selected.

This invention also concerns a method of increasing or decreasing the expression of at 
least one heterologous nucleic acid fragment in a plant cell which comprises: 

(a) transforming a plant cell with the chimeric genes described herein; 

(b) growing fertile mature plants from the transformed plant cell of step (a);

(c) selecting plants containing a transformed plant cell wherein the expression of
the heterologous nucleic acid fragment is increased or decreased.

Transformation and selection can be accomplished using methods well-known to 
those skilled in the art including, but not limited to, the methods described herein. 




EXAMPLES 

The present invention is further defined in the following Examples. From the above
discussion and these Examples, one skilled in the art can ascertain the essential characteristics
of this invention, and without departing from the spirit and scope thereof, can make various
changes and modifications of the invention to adapt it to various usages and conditions.

Unless otherwise stated, all parts and percentages are by weight and degrees are
Celsius. Techniques in molecular biology were typically performed as described in Ausubel,
F. M., et al., (1990, Current Protocols in Molecular Biology, John Wiley and Sons, New
York) or Sambrook, J. et al., (1989, Molecular Cloning - A Laboratory Manual, 2nd ed. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York).

EXAMPLE 1 

Composition of cDNA Libraries: Isolation and Sequencing of cDNA Clones 
cDNA libraries representing mRNAs from soybean tissues were prepared in Uni-ZAP
XR™ vectors according to the manufacturer's protocol (Stratagene, La Jolla, CA).
Conversion of the Uni-ZAP XR™ libraries into plasmid libraries was accomplished
according to the protocol provided by Stratagene. Upon conversion, cDNA inserts were
contained in the plasmid vector pBluescript™ (Stratagene). DNA was prepared for
sequencing from randomly selected bacterial colonies containing recombinant pBluescript™
plasmids either by amplifying the cDNA inserts via polymerase chain reaction using primers
specific for vector sequences flanking the cloning site or by preparing plasmid DNA from
cultured bacterial cells. Amplified insert DNAs or plasmid DNAs were sequenced in
dye-primer sequencing reactions using a Perkin Elmer Model 377 fluorescent sequencer to
generate partial cDNA sequences termed expressed sequence tags or "ESTs" (see Adams,
M. D. et al., (1991) Science 252:1651).

EXAMPLE 2

Identification of SAMS cDNA Clones 
ESTs encoding SAMS were identified by conducting BLAST (Basic Local Alignment
Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the
BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences
derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major
release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The
cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available
DNA sequences contained in the "nr" database using the BLASTN algorithm provided by the
National Center for Biotechnology Information (NCBI). The DNA sequences were translated
in all reading frames and compared for similarity to all publicly available protein sequences
contained in the "nr" database using the BLASTX algorithm (Gish, W. and States, D. J.
(1993) Nature Genetics 3:266-272 and Altschul, S. F., et al., (1997) Nucleic Acids Res.
25:3389-3402) provided by the NCBI. For convenience, the P-value (probability) of
observing a match of a cDNA sequence to a sequence contained in the searched databases
merely by chance, as calculated by BLAST, is reported herein as a "pLog" value, which
represents the negative of the logarithm of the reported P-value. Accordingly, the greater the
pLog value, the greater the likelihood that the cDNA sequence and the BLAST "hit"
represent homologous proteins.
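
The pLog definition above can be written out in a few lines. This is a minimal sketch, assuming the logarithm is base 10 (the text says only "the logarithm"); the helper names `plog` and `p_from_plog` are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative base-10 logarithm of a BLAST P-value (base 10 is assumed)."""
    if p_value <= 0.0:
        raise ValueError("P-value must be positive")
    return -math.log10(p_value)


def p_from_plog(value: float) -> float:
    """Inverse of plog: recover the P-value from a pLog value."""
    return 10.0 ** (-value)


print(plog(1e-10))         # 10.0
print(p_from_plog(61.59))  # chance probability implied by a pLog of 61.59
```

Under this assumption, a larger pLog corresponds to a smaller chance probability, which is why larger values indicate a more likely homolog.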

The BLASTX search using the nucleotide sequence from clone s2.12b06 revealed that
this nucleotide sequence encoded a protein that was similar to the protein encoded by the
cDNA to Oryza sativa (EMBL Accession No. Z26867) S-adenosylmethionine synthetase; the
pLog value for this match was 61.59. This cDNA clone was completely sequenced (SEQ ID
NO:1) and found to contain an open reading frame ranging from nucleotides 74 to 1252
which is predicted to encode a full length SAMS polypeptide.
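
As a quick consistency check on the reading frame coordinates just given, the sketch below computes the length of an interval with 1-based inclusive endpoints and confirms it is a whole number of codons. Whether the stated interval includes the stop codon is not specified in the text, so the sketch reports codons rather than amino acids; the function name `orf_summary` is hypothetical.

```python
def orf_summary(start: int, end: int) -> tuple[int, int]:
    """Return (length in nucleotides, number of codons) for a 1-based,
    inclusive open reading frame interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return length, length // 3


print(orf_summary(74, 1252))  # (1179, 393)
```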

A high level of expression of the SAMS genes was inferred by the high frequency of
occurrences of soybean cDNA sequences with homology to Oryza sativa SAMS obtained
from many different cDNA libraries prepared from many different soybean cell types.
SAMS homologs were the third most abundant class of ESTs found in the soybean libraries.
Although the ranking might not represent a precise estimate of the relative abundance of the
SAMS transcripts in vivo in all soybean libraries, due to the selective use of different cDNA
libraries, it did indicate that SAMS was a highly expressed gene. The EST sequence data
also revealed that there were several SAMS isoforms in the soybean genome.

EXAMPLE 3 

S-adenosylmethionine Synthetase is Encoded by a Gene Family 
Southern hybridization analysis of soybean genomic DNA with a full length SAMS
cDNA (SEQ ID NO:1) as a probe suggested that there are at least four related SAMS genes in
the soybean genome (Figure 1A). The DNA probe for Southern hybridization was prepared
as follows: plasmid DNA was prepared from an overnight bacterial culture in LB broth
(GIBCO BRL, Gaithersburg, MD) using a QIAprep™ miniprep kit (Qiagen, Valencia, CA);
cDNA inserts encoding SAMS were excised by restriction enzyme digestion and recovered
from agarose gel following electrophoretic separation using a QIAquick™ gel extraction kit
(Qiagen). The 1518 bp SAMS cDNA fragment (SEQ ID NO:1) was labeled with
digoxigenin-dUTP as a probe by random primed DNA labeling (Boehringer Mannheim).
Twenty micrograms of soybean genomic DNA was digested with different restriction
enzymes and the resulting fragments were resolved on a 0.7% agarose gel. The DNA gel was
depurinated in 0.25 M HCl, denatured in 0.5 M NaOH/1.5 M NaCl, neutralized in 1 M
Tris-Cl, pH 8.0/1.5 M NaCl, and transferred in 20x SSC (GIBCO BRL) to nylon membrane
(Boehringer Mannheim). The Southern blot was hybridized with the SAMS cDNA-specific
probe at 45°C overnight in Easy Hyb (Boehringer Mannheim). The blot was washed
10 minutes in 2x SSC/0.1% SDS, and 3x 10 minutes in 0.1x SSC/0.1% SDS at 65°C. The
hybridized probe was detected with chemiluminescent reagent CDP-Star (Boehringer
Mannheim) according to the manufacturer's protocol. Multiple bands were detected in
BamHI, EcoRI, and HindIII digestions (Figure 1A). The large band in KpnI and SacI
digestions may represent more than one DNA fragment because the band is too big for good
resolution. The hybridization patterns presented in Figure 1A and the analysis of partial
SAMS cDNA sequences from DuPont's EST database suggest that there are at least four
copies of the SAMS gene in the soybean genome and that their sequences are conserved.

The 1314 bp SAMS promoter fragment (SEQ ID NO:6) was also labeled with
digoxigenin-dUTP by random primed DNA labeling (Boehringer Mannheim). The
labeled SAMS promoter probe was used to hybridize the same Southern blot described
above. The SAMS promoter-specific probe hybridized to a single band in each of the five
different digestions, BamHI, EcoRI, HindIII, KpnI, and SacI (Figure 1B). The results
indicate that the SAMS promoter is present as only a single copy in the soybean genome.

A northern hybridization experiment indicated that SAMS gene transcripts were
present in a variety of soybean tissues and that the abundance of SAMS gene transcripts did
not differ greatly from tissue to tissue. Total RNAs were extracted from soybean leaves,
stems, young seeds, medium seeds, old seeds, and pod coats using Trizol™ Reagent
according to the manufacturer's protocol (GIBCO BRL). Ten micrograms of total RNA were
loaded in each well of a 1.2% agarose gel containing 7% formaldehyde in 1x MOPS buffer,
20 mM 3-[N-morpholino]propane-sulfonic acid, 5 mM sodium acetate, 1 mM EDTA, pH 6.0.
RNA was transferred to nylon filters (Micron Separations Inc., Westborough, MA) in 10x
SSC and crosslinked to the filters with UV light. Filters were hybridized with probes
prepared from cDNA insert fragments in 50% deionized formamide, 5x SSPE, 1x Denhardt's
solution, 0.1% SDS, and 100 µg denatured salmon sperm DNA (Sigma, St. Louis, MO) at
42° for 24 hours. Filters were washed in 2x SSPE and 0.1% SDS at room temperature for
10 minutes, 1x SSPE and 0.1% SDS at 65° for 10 minutes, and then in 0.1x SSPE and 0.1%
SDS at 65° for 10 minutes. Filters were exposed to Kodak X-ray film at -80°. The abundance
of SAMS transcripts in leaves, roots, stems, young seeds, medium seeds, old seeds, and pod
coats can be seen in Figure 9. The weak signals observed in the hybridizations to RNA
samples from root and young seed were attributed to underloading, because hybridizations
with ribosomal RNAs that serve as internal controls were also relatively weak in those
samples (data not shown). Because of the high sequence similarities among the four SAMS
gene isoforms, this RNA gel blot was not able to indicate how the isoforms were distributed
in any particular tissue. However, the experiment demonstrated that all examined soybean
tissues contained SAMS messenger RNA.

EXAMPLE 4

Cloning of the Soybean S-adenosylmethionine Synthetase Gene Promoter
The soybean full length SAMS cDNA (SEQ ID NO:1), obtained in Example 2, was
used to generate a probe to isolate a SAMS promoter. The full length SAMS cDNA
sequence consisted of 1518 bp, and it had a 74 bp 5'-untranslated region and a PstI site at
position 296. Because the cDNA clone was harbored in a pBluescript™ SK vector having a
PstI site upstream of the EcoRI cloning site, digestion of the clone with PstI generated a
315 bp fragment of DNA. The resulting restriction fragment contained 19 bp of vector and
cloning linker adapter sequence in addition to the 296 bp of SAMS cDNA sequence. This
PstI fragment was labeled with α-32P-dCTP, as described in Example 3, and used as a probe
to screen a soybean genomic DNA library that had been constructed in an EMBL3 SP6/T7
vector (ClonTech, Palo Alto, CA). The library was plated with LE392 (ClonTech) cells at
50,000 plaque forming units (pfu) per 150 mm NZCYM agar plate (GIBCO BRL). Plaques
were transferred to Hybond nylon membranes, and the plaque replicas were then denatured
and neutralized according to the manufacturer (Amersham Life Science, Arlington Heights,
IL). The phage DNA was fixed on the membranes by UV-crosslinking (Stratagene). After
prehybridization at 65° for 1 hour in 0.5 M NaHPO4, pH 7.2, 1 mM EDTA, 1% crystalline
BSA (Sigma), and 7% SDS, the SAMS 315 bp PstI fragment probe was denatured in a boiling
water bath for 5 minutes and added to the same hybridization solution, and was hybridized at
65° for 24 hours. The membranes were washed in 40 mM NaHPO4, pH 7.2, 1 mM EDTA,
0.5% crystalline BSA, and 5% SDS for 10 minutes at room temperature, and then 3x
10 minutes at 65° in 40 mM NaHPO4, pH 7.2, 1 mM EDTA, and 1% SDS. The membranes
were exposed to Kodak X-ray film (Sigma) at -80°. Positive SAMS genomic DNA phage
clones were suspended in SM buffer, 50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 0.2%
MgSO4·7H2O, and 0.1% gelatin, and purified by a secondary screening following the same
procedure. Twenty three strongly hybridizing plaques were identified by the first screening
from a total of 3x10^5 pfu, and fifteen were later purified. DNAs were prepared from two of
the purified phage clones (Ausubel et al., (1990) pp. 1.13.4-1.13.8), digested with
BamHI, ClaI, PstI, and NcoI and prepared for a Southern blot. The blot was hybridized with
the SAMS 315 bp PstI fragment probe prepared and used as above. A single positive
fragment of clone 1 was identified from the ClaI digestion. Since the ClaI restriction site in
the cDNA clone is 843 bp from the 5' end of the full length cDNA, the 2.5 kb ClaI fragment
was expected to include about 1.7 kb of DNA upstream of the coding sequence, which was
considered sufficient to contain the SAMS promoter.
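
The "about 1.7 kb" estimate follows from simple subtraction. A minimal sketch is given below, assuming the fragment is exactly 2500 bp (the text says only 2.5 kb) and that the genomic and cDNA sequences are colinear over the 843 bp between the 5' end of the cDNA and the restriction site; the function name `expected_upstream_bp` is hypothetical.

```python
def expected_upstream_bp(fragment_bp: int, site_offset_in_cdna_bp: int) -> int:
    """Estimate how much of a genomic restriction fragment lies 5' of the
    region matched by the cDNA, assuming colinear genomic and cDNA sequence."""
    return fragment_bp - site_offset_in_cdna_bp


estimate = expected_upstream_bp(2500, 843)
print(estimate)  # 1657, i.e. about 1.7 kb
```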

The 2.5 kb ClaI genomic DNA fragment was cloned into pBluescript™ KS and the
DNA insert was sequenced. The 3' end sequence of the genomic DNA fragment was
expected to match the 5' end sequence of SAMS cDNA from the 5' end to the ClaI site at
position 843. However, comparison of the genomic DNA sequence and the cDNA sequence
revealed that the two sequences have 191 bp of overlapping sequence starting at position 54
and ending at position 245 of the cDNA sequence (SEQ ID NO:1). The sequence of the
2.5 kb genomic DNA clone downstream of the 191 bp overlapping region was determined to
be derived from the cloning vector, lambda EMBL3 SP6/T7, which contributed 257 bp of
sequence to the 3' end of the 2.5 kb SAMS ClaI fragment including the ClaI cloning site.
Therefore, the soybean derived DNA in the 2.5 kb ClaI fragment is described by the 2336 bp
DNA sequence shown in SEQ ID NO:2.

The DNA sequence of the genomic DNA in the 191 bp region (from nucleotide 2145
to the end of the sequence) was very similar to, but did not match perfectly, the cDNA
sequence; there were six base pair mismatches in this region. This was not surprising,
because it was known from the experiments described in Example 3 that there is a small
family of SAMS genes in soybean. It was concluded that this genomic clone is not derived
from the same gene from which the cDNA used as the probe was transcribed. It was also
noted that the 53 bp at the 5' end of the cDNA did not show any similarity to the genomic
sequence upstream of the 191 bp overlapping region (Figure 2).

A BLASTN search of the DuPont soybean EST database using the nucleotide 
sequence from the soybean SAMS genomic DNA upstream of the 191 bp region revealed 
many cDNA clones that matched a 60 bp region of the genomic DNA from nucleotide 1496 
to 1555. The sequence of one such cDNA, designated srrlc.pk002.b21, is shown in SEQ ID
NO:3.

The cDNA sequence in SEQ ID NO:3 perfectly matches the genomic sequence in
SEQ ID NO:2 from nucleotide 2 to 60 of the cDNA. There follows a region of 591
nucleotides in the genomic DNA that is absent from the cDNA. Then the region from
nucleotide 60 to 250 of the cDNA perfectly matches the 191 bp region at the 3' end of the
genomic DNA. This indicates the presence of a 591 nucleotide intron in the genomic DNA in
the 5' transcribed, but untranslated, region of the SAMS gene. The presence of consensus 5'
and 3' splice junctions in the genomic DNA at the exon-intron junctions supports this
conclusion. Thus, the 53 bp at the 5' end of the cDNA used as the probe (SEQ ID NO:1) did
not match the genomic sequence because the genomic sequence at that position in the
alignment was from the intron. However, the 53 bp at the 5' end of the cDNA of SEQ ID
NO:1 is very similar to the 60 nucleotides at the 5' end of the cDNA of SEQ ID NO:3,
suggesting that the gene from which SEQ ID NO:1 was transcribed also contains an intron at
the analogous position.

A 1305 bp SAMS genomic DNA fragment starting at nucleotide 856 and ending at
nucleotide 2160 of SEQ ID NO:2 was amplified by PCR from the 2.5 kb ClaI clone. The
promoter fragment was amplified from this fragment using primers sam-5 (SEQ ID NO:4)
and sam-6 (SEQ ID NO:5) and Pfu DNA polymerase (Stratagene).






CATGCCATGGCTTTATACTTCAAAAACTGCAC (SEQ ID NO:4)

GCTCTAGATCAAACTCACATCCAA (SEQ ID NO:5) 



An XbaI site and an NcoI site were introduced to the 5' end and 3' end, respectively, of the
PCR fragment by using these specifically designed primers. The NcoI site includes the ATG
start codon of the SAMS coding region. The resulting 1314 bp fragment is shown in SEQ ID
NO:6 and includes the SAMS promoter and the translation leader region, which is interrupted
by the 591 nucleotide intron.
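
The restriction sites carried by the two primers can be located with a short string search. The sketch below uses the primer sequences given above (SEQ ID NOs:4 and 5) and the standard recognition sequences for NcoI (CCATGG) and XbaI (TCTAGA); the function name `find_sites` is hypothetical, and positions are 0-based offsets from the 5' end of each primer.

```python
# Standard recognition sequences; both are palindromic, so strand
# orientation does not affect the search.
SITES = {"NcoI": "CCATGG", "XbaI": "TCTAGA"}

PRIMERS = {
    "sam-5": "CATGCCATGGCTTTATACTTCAAAAACTGCAC",  # SEQ ID NO:4
    "sam-6": "GCTCTAGATCAAACTCACATCCAA",          # SEQ ID NO:5
}


def find_sites(primer: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in `primer` to the 0-based
    offset of the first occurrence."""
    return {name: primer.find(site)
            for name, site in SITES.items() if site in primer}


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
# sam-5 {'NcoI': 4}
# sam-6 {'XbaI': 2}
```

The NcoI recognition sequence CCATGG contains ATG, which is consistent with the statement that the NcoI site includes the start codon.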

Using PCR amplification procedures and appropriate primers, additional SAMS
promoter fragments can be produced from the 2336 nucleotide fragment of SEQ ID NO:2.
These include, but are not limited to, the three fragments provided in SEQ ID NOs:14, 15 and
16. SEQ ID NO:14 is a 2165 nucleotide sequence of a SAMS promoter DNA fragment
which starts at the 5' end of the 2336 nucleotide sequence of SEQ ID NO:2 and ends at the
ATG translation start codon of the SAMS protein. SEQ ID NO:15 is a 1574 nucleotide
sequence of a SAMS promoter DNA fragment which starts at the 5' end of the 2336
nucleotide sequence of SEQ ID NO:2 and ends at the ATG translation start codon of the
SAMS protein, and from which the 591 nucleotide long intron sequence has been removed.
SEQ ID NO:16 is a 719 nucleotide sequence of a SAMS promoter DNA fragment which
starts at nucleotide 4 of SEQ ID NO:6 and ends at the ATG translation start codon of the
SAMS protein, and from which the 591 nucleotide long intron sequence has been removed.
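The fragment lengths quoted above are mutually consistent, as the following arithmetic sketch shows. It assumes 1-based inclusive coordinates and that the ATG lies at nucleotides 1311-1313 of SEQ ID NO:6 (inside the terminal CCATGG of the listed sequence); the helper `span_len` is a hypothetical name used only for this illustration.

```python
INTRON_LEN = 591  # intron length stated in the text


def span_len(start, end):
    """Length of an inclusive, 1-based interval."""
    return end - start + 1


# Genomic PCR fragment: nucleotides 856-2160 of SEQ ID NO:2.
genomic_pcr_len = span_len(856, 2160)

# SEQ ID NO:15 is SEQ ID NO:14 (2165 nt) with the intron removed.
seq15_len = 2165 - INTRON_LEN

# SEQ ID NO:16 runs from nucleotide 4 of SEQ ID NO:6 to the ATG,
# assumed here to end at nucleotide 1313, with the intron removed.
seq16_len = span_len(4, 1313) - INTRON_LEN

print(genomic_pcr_len, seq15_len, seq16_len)
```

The three values agree with the 1305, 1574 and 719 nucleotide lengths given in the text.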

EXAMPLE 5

Expression of the GUS Gene by the SAMS Promoter in Arabidopsis
The activity of the soybean SAMS promoter was tested by its ability to express the
GUS reporter gene in transgenic Arabidopsis plants carrying the SAMS promoter::GUS::3'
Nos expression cassette. GUS refers to the E. coli β-glucuronidase gene (GUS) (Jefferson,
(1987) Plant Mol. Biol. Rep. 5:387-405) and 3' Nos refers to the transcription termination
region from the nopaline synthase (Nos) gene (Depicker et al., (1982) J. Mol. Appl. Genet.
1:561-570). The SAMS promoter fragment (SEQ ID NO:6) was digested with XbaI and
NcoI and inserted into plasmid pMH40Δ (SEQ ID NO:17), which contained a 35S
promoter::GUS::3' Nos plant expression cassette. The XbaI/NcoI SAMS promoter DNA
fragment replaced the 35S promoter of pMH40Δ, to form the pZSL11 plasmid (Figure 3).
The SAMS promoter::GUS::3' Nos DNA fragment (SEQ ID NO:18) was excised from
pZSL11 by HindIII and SacI digestion and transferred into the corresponding sites of the
pBI101 (ClonTech) binary vector. The cloned SAMS promoter was sequenced to verify that no
sequence error was generated by the PCR amplification.

The SAMS::GUS expression cassette was introduced into wild type Arabidopsis
thaliana by Agrobacterium-mediated transformation. A. thaliana ecotype Columbia plants were
grown in a 22°C chamber with continuous light and transformed by the vacuum infiltration
method using GV3101 Agrobacteria (Bent, A. et al., (1994) Science 265:1856-1860). Transformed
Arabidopsis seeds were selected by germination on Murashige and Skoog minimal salt
(GIBCO BRL) plus 0.2% phytagel (Sigma), 1% sucrose, and 100 mg/ml kanamycin. The
kanamycin resistant seedlings were transferred into soil and grown in a 22°C chamber under
continuous light.

For histochemical GUS staining, plant tissues were incubated in 0.5% 5-bromo-4-
chloro-3-indoxyl-β-D-glucuronic acid (X-gluc, Biosynth AG, Switzerland) in 50 mM sodium
phosphate, pH 7.0, 10 mM EDTA, 0.5 mM potassium ferricyanide, and 0.5 mM potassium
ferrocyanide at 37°C overnight, and then chlorophyll was removed with 75% ethanol. Pictures
were taken using a Nikon dissecting microscope. Strong GUS expression was detected in all
the parts of the transgenic Arabidopsis plants, including flowers (Figure 4A), leaves
(Figure 4B), stems (bolt) (Figure 4C), silique coats and developing seeds (Figure 4D-F),
developing embryos (Figure 4G), and seedlings (not shown). The GUS staining on leaves
and silique coats was uniform with all the veins and mesophyll tissues similarly stained,
while staining on flowers and stems was not uniform. Although some seeds were not stained
for GUS activity due to genetic segregation, the funiculi that connected these seeds to the
silique coat stained positively for GUS activity (Figure 4G). These results indicated that the
soybean SAMS promoter was a constitutive promoter and was able to function in a
heterologous plant.

The GUS activities of the transgenic Arabidopsis plants were further analyzed by a
fluorometric assay. For fluorescence analysis, plant tissues were ground to homogeneity in
microfuge tubes with extraction buffer (50 mM phosphate buffer, pH 7.0, 10 mM EDTA,
0.1% Triton X-100, 0.1% N-lauroyl sarcosine, and 10 mM β-mercaptoethanol). The samples
were centrifuged at 14,000 rpm for 10 minutes, and aliquots of the supernatant were used to
determine protein concentrations by the Bradford method (Bio-Rad, Hercules, CA) using
96 well microtiter plates read with a kinetic microplate reader (Molecular Devices,
Sunnyvale, CA). The β-glucuronidase activities were analyzed by standard protocol
(Jefferson et al., (1987) EMBO J. 6:3901-3907) using 96 well microtiter plates read with a
Cytofluor multiwell plate reader (PerSeptive Biosystems, Framingham, MA). Data were
entered into a Microsoft Excel spreadsheet and analyzed. Triplicate samples of flower, leaf,
stem, silique coat, young seed (white), medium seed (light green), old seed (dark green), and
dry seed from six plants were analyzed. The soybean SAMS promoter was active in all the
tissues analyzed (Figure 5). Promoter activity varied among the six lines, as is typically seen
among plant transformants. The basic expression patterns were similar among all the lines,
and the average SAMS promoter activity was comparable to that of the 35S promoter
(Battraw and Hall, (1990) Plant Mol. Biol. 15:527-538; Jefferson et al., (1987) EMBO J.
6:3901-3907; Atanassova et al., (1998) Plant Mol. Biol. 37:275-285; Holtorf et al., (1995)
Plant Mol. Biol. 29:637-646; Wilmink et al., (1995) Plant Mol. Biol. 28:949-955). The
SAMS promoter was very active in developing seeds, especially in early and medium stages
of development, and the GUS specific activities were in the range of 5-40 pmole 4-MU
(4-methylumbelliferone) per microgram protein per minute, which is comparable to many
strong promoters (Atanassova et al., (1998) Plant Mol. Biol. 37:275-285; Comai et al., (1990)
Plant Mol. Biol. 15:373-381; Holtorf et al., (1995) Plant Mol. Biol. 29:637-646; Wilmink et
al., (1995) Plant Mol. Biol. 28:949-955).
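The unit used above (pmole 4-MU per microgram protein per minute) can be illustrated with a minimal calculation. This sketch is not the spreadsheet analysis used in the example: the function name and the numeric readings are hypothetical, and it assumes that fluorescence has already been converted to pmol of 4-MU with a standard curve.

```python
def gus_specific_activity(pmol_mu_start, pmol_mu_end, minutes, protein_ug):
    """Return pmol 4-MU released per microgram protein per minute."""
    if minutes <= 0 or protein_ug <= 0:
        raise ValueError("reaction time and protein amount must be positive")
    return (pmol_mu_end - pmol_mu_start) / (minutes * protein_ug)


# Hypothetical reading: 3000 pmol 4-MU released in 30 minutes by an
# aliquot containing 5 micrograms of total protein.
example_activity = gus_specific_activity(0.0, 3000.0, 30.0, 5.0)
print(example_activity)
```

With these made-up inputs the result is 20 pmol 4-MU per microgram protein per minute, a value inside the 5-40 range reported for developing seeds.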

EXAMPLE 6
Expression of GUS Gene by SAMS Promoter in Corn
In order to test whether the dicot SAMS promoter also worked in monocot plants,
pZSL11 was introduced into corn leaf discs and callus by gene bombardment for transient
gene expression assay using the biolistic particle delivery system PDS-1000/He (Bio Rad,
Hercules, CA). The pMH40Δ plasmid DNA (as set forth in SEQ ID NO:17), which
contained the 35S promoter and GUS reporter gene, was also introduced into corn callus and
leaf discs by gene bombardment to serve as a positive control vector. After incubation
overnight at 37°C, bombarded tissues were stained for GUS activity. GUS expression was
demonstrated by the blue spots on both the callus (Figure 6A) and leaf discs (Figure 6B)
bombarded with pZSL11. As expected, the positive control 35S::GUS cassette was also
expressed in both callus and leaf discs (Figure 6C, D).

EXAMPLE 7

Expression of Methionine Biosynthesis Genes by SAMS Promoter

The SAMS promoter was fused to two soybean cDNAs, one encoding ATP
sulfurylase (ATPS) and a second encoding cystathionine-γ-synthase (CGS1). The soybean
ATPS and CGS1 cDNAs were isolated from soybean embryo cDNA libraries using the same
procedures as described in Example 1 and Example 2 for isolation of soybean SAMS cDNAs.
The coding regions and the 3' untranslated region (UTR) of soybean ATPS and CGS1 genes
were inserted into pZSL11, replacing the GUS gene. The resulting SAMS promoter::ATPS
and SAMS promoter::CGS1 expression cassettes, SEQ ID NO:19 and SEQ ID NO:20,
respectively, were inserted into binary vectors for Arabidopsis transformation, and
transformation was performed as described in Example 5. Transgenic Arabidopsis plants
with soybean ATPS and CGS1 genes controlled by the SAMS promoter were analyzed by
PCR for the presence of the transgenes and by RT-PCR for expression of the transgenes.
Genomic DNA used for PCR analysis was prepared from Arabidopsis siliques and leaves
using 7 M urea, 1.5 M NaCl, 50 mM Tris, pH 8.0, 20 mM EDTA, and 1% N-lauroyl-sarcosine,
followed by phenol extraction and ethanol precipitation. Primer sam-9 (SEQ ID
NO:7), which is specific to the SAMS promoter, and primers specific to the target genes, atps-1
(SEQ ID NO:8) for the ATPS gene and cgs-8 (SEQ ID NO:9) for the CGS1 gene, were used
in PCR with Taq DNA polymerase (GIBCO BRL) to detect the existence of SAMS::ATPS
and SAMS::CGS1 in transgenic Arabidopsis plants.

TTCGAGTATAGGTCACAATAGG (SEQ ID NO:7)

CTTCGCTGAGGACATGGAC (SEQ ID NO:8)

GAGTTGTCGCTGTTGTTCGAC (SEQ ID NO:9)
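The position of primer sam-9 on the cloned promoter fragment can be located by a plain string search. The sketch below is illustrative: the template is an excerpt of SEQ ID NO:6 (nucleotides 1021-1140, copied from the Sequence Listing), and the function `locate` is a hypothetical helper rather than part of any described procedure.

```python
# Excerpt of SEQ ID NO:6, nucleotides 1021-1140, copied from the Sequence Listing.
EXCERPT_START = 1021
EXCERPT = (
    "ttgtttgtctgtagttttgttttgttttcttgtttctcatacattccttaggcttcaatt"
    "ttattcgagtataggtcacaataggaattcaaactttgagcaggggaattaatcccttcc"
)

SAM9 = "TTCGAGTATAGGTCACAATAGG"  # SEQ ID NO:7


def locate(primer, template, template_start=1):
    """Return the 1-based start of an exact primer match, or None if absent."""
    idx = template.lower().find(primer.lower())
    return None if idx < 0 else template_start + idx


sam9_start = locate(SAM9, EXCERPT, EXCERPT_START)
sam9_end = sam9_start + len(SAM9) - 1
print(sam9_start, sam9_end)
```

The search finds an exact match for sam-9 within the excerpt, consistent with the statement that this primer is specific to the SAMS promoter fragment.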



RNA samples used for RT-PCR were prepared with Trizol™ Reagent (GIBCO BRL).
Antisense primers atps-4 (SEQ ID NO:10) and cgs-10 (SEQ ID NO:11) were used in reverse
transcription reactions with SuperScript II™ RT (GIBCO BRL) following the vendor's
instructions.

AACACAGCATCCGCATTGCG (SEQ ID NO:10)

AGGAGTGCAGAATCAGATCAG (SEQ ID NO:11)

The first strand cDNAs were used in PCR with primer pairs atps-3 (SEQ ID NO:12) and
atps-4 (SEQ ID NO:10) for SAMS::ATPS transgenic plants, and cgs-9 (SEQ ID NO:13) and
cgs-10 (SEQ ID NO:11) for SAMS::CGS1 transgenic plants. PCR and RT-PCR products were
resolved by agarose gel electrophoresis.

GCTGATCGAACCAGATGGAG (SEQ ID NO:12)

CTGTACAGTTAAACAGTAGTTCT (SEQ ID NO:13)

All ten SAMS::CGS1 transgenic Arabidopsis plants harbored the SAMS::CGS1 expression
cassette as revealed by PCR with SAMS::CGS1-specific primers (Figure 7A). The same
analysis revealed that all ten SAMS::ATPS transgenic Arabidopsis plants
contained the SAMS::ATPS expression cassette (Figure 7A). RT-PCR analysis detected
CGS1 transcripts and ATPS transcripts, respectively, in most of the transgenic plants
(Figure 7B). This shows that the SAMS promoter is capable of driving expression of a
variety of different genes in most or all cell types in transformed plants.

EXAMPLE 8
Induction of SAMS Promoter Activity by Methionine

Since SAMS catalyzes the reaction to synthesize S-adenosyl-L-methionine from
methionine and ATP, free methionine levels might regulate SAMS promoter activity. To see
if the SAMS promoter is regulated by external methionine, the SAMS::GUS transgenic
Arabidopsis seeds were germinated in the presence of either H2O, 1x Murashige and Skoog
salt (GIBCO BRL), 0.01 mM methionine (Sigma), or 1 mM methionine. Ten-day-old
seedlings from ten independent transgenic lines were analyzed for GUS activity according to
the protocol described in Example 5. GUS activity for each treatment, in the order given
above, for each transgenic line is shown in Figure 8. All lines responded similarly to the
different treatments. Compared to the H2O control treatment, SAMS promoter activity was
induced more than two-fold by 0.01 mM free methionine and inhibited about 40% on average
by 1x MS salt. The induction of the SAMS promoter by 1 mM methionine was less than
that by 0.01 mM methionine, probably due to a toxic effect of the high methionine
concentration; this toxic effect was indicated by the smaller sizes and shorter roots of the
seedlings grown in the presence of 1 mM methionine. The toxic effect of high levels of
methionine was even more apparent at 10 mM free methionine, since only a few Arabidopsis
seeds were able to germinate and none survived in the presence of 10 mM free methionine.
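The comparison against the water control can be expressed as a simple ratio. The following sketch uses invented GUS readings purely to show the calculation; the numbers are hypothetical placeholders and are not the data of Figure 8.

```python
def relative_activity(treatment, control):
    """Ratio of treatment GUS activity to control GUS activity."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return treatment / control


# Hypothetical GUS specific activities for one transgenic line (arbitrary units).
readings = {"H2O": 10.0, "1x MS": 6.0, "0.01 mM Met": 22.0, "1 mM Met": 15.0}

# Express every treatment relative to the H2O control.
fold = {name: relative_activity(value, readings["H2O"])
        for name, value in readings.items()}
for name, ratio in fold.items():
    print(f"{name}: {ratio:.2f}x of H2O control")
```

A ratio above 2 corresponds to "induced more than two-fold", and a ratio of 0.6 corresponds to "inhibited about 40%".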
CLAIMS

What is claimed is:

1. An isolated nucleic acid fragment comprising a promoter wherein said promoter
consists essentially of the nucleotide sequence set forth in SEQ ID NOs:6, 14, 15, or 16 or
said promoter consists essentially of a fragment or subfragment that is substantially similar
and functionally equivalent to the nucleotide sequence set forth in SEQ ID NOs:6, 14, 15, or
16.

2. A chimeric gene comprising at least one heterologous nucleic acid fragment
operably linked to the promoter of Claim 1 or Claim 10.

3. A plant containing the chimeric gene of Claim 2.

4. The plant of Claim 3 wherein said plant is a monocot selected from the group
consisting of corn, rice, wheat, barley and palm.

5. The plant of Claim 3 wherein said plant is a dicot selected from the group
consisting of Arabidopsis, soybean, oilseed Brassica, peanut, sunflower, safflower, cotton,
tobacco, tomato, potato, and cocoa.

6. The plant of Claim 5 wherein said plant is soybean.

7. Seeds of the plants of Claims 3, 4, 5, or 6.

8. A method of increasing or decreasing the expression of at least one heterologous
nucleic acid fragment in a plant cell which comprises:

(a) transforming a plant cell with the chimeric gene of Claim 2;

(b) growing fertile mature plants from the transformed plant cell of step (a);

(c) selecting plants containing a transformed plant cell wherein the expression of
the heterologous nucleic acid fragment is increased or decreased.

9. The method of Claim 8 wherein the plant is a monocot selected from the group
consisting of corn, rice, wheat, barley and palm.

10. The method of Claim 9 wherein the plant is a dicot selected from the group
consisting of Arabidopsis, soybean, oilseed Brassica, peanut, sunflower, safflower, cotton,
tobacco, tomato, potato, and cocoa.

11. The method of Claim 10 wherein the plant is soybean.

12. An isolated nucleic acid fragment comprising a constitutive plant SAMS
promoter.
TITLE

S-ADENOSYL-L-METHIONINE SYNTHETASE PROMOTER
AND ITS USE IN EXPRESSION OF TRANSGENIC GENES IN PLANTS
ABSTRACT OF THE DISCLOSURE
A constitutive plant S-adenosyl-L-methionine synthetase (SAMS) promoter and
subfragments thereof and their use in promoting the expression of one or more heterologous
nucleic acid fragments in plants are described.
FIGURE 2

[Figure 2: nucleotide sequence of the SAMS genomic DNA fragment (SEQ ID NO:2) aligned
with the 5' end of the SAMS cDNA (SEQ ID NO:1).]
Post Office Address: 24 FOREST CREEK DRIVE
City: HOCKESSIN
State or Country: DELAWARE
Zip Code: 19707



□ Additional Inventors are being named on separately numbered sheets attached hereto. 



SEQUENCE LISTING 



<110> FALCO, SAVERIO CARL 
LI, ZHONGSEN 

<120> S-ADENOSYL-L-METHIONINE SYNTHETASE PROMOTER AND
ITS USE IN EXPRESSION OF TRANSGENIC GENES IN PLANTS

<130> BB1205 US NA 

<140> 
<141> 

<150> 60/113,045 
<151> 1998-12-21 

<160> 16 

<170> Microsoft Office 97 



<210> 1

<211> 1518

<212> DNA

<213> Glycine max

<400> 1

agccaagccc cactcaacca ccacaccact ctctctgctc ttcttctacc tttcaagttt 60
ttaaagtatt aagatggcag agacattcct atttacctca gagtcagtga acgagggaca 120
ccctgacaag ctctgcgacc aaatctccga tgctgtcctc gacgcttgcc ttgaacagga 180
cccagacagc aaggttgcct gcgaaacatg caccaagacc aacttggtca tggtcttcgg 240
agagatcacc accaaggcca acgttgacta cgagaagatc gtgcgtgaca cctgcaggaa 300
catcggcttc gtctcaaacg atgtgggact tgatgctgac aactgcaagg tccttgtaaa 360
cattgagcag cagagccctg atattgccca gggtgtgcac ggccacctta ccaaaagacc 420
cgaggaaatc ggtgctggag accagggtca catgtttggc tatgccacgg acgaaacccc 480
agaattgatg ccattgagtc atgttcttgc aactaaactc ggtgctcgtc tcaccgaggt 540
tcgcaagaac ggaacctgcc catggttgag gcctgatggg aaaacccaag tgactgttga 600
gtattacaat gacaacggtg ccatggttcc agttcgtgtc cacactgtgc ttatctccac 660
ccaacatgat gagactgtga ccaacgacga aattgcagct gacctcaagg agcatgtgat 720
caagccggtg atcccggaga agtaccttga tgagaagacc attttccact tgaacccctc 780
tggccgtttt gtcattggag gtcctcacgg tgatgctggt ctcaccggcc gcaagatcat 840
catcgatact tacggaggat ggggtgctca tggtggtggt gctttctccg ggaaggatcc 900
caccaaggtt gataggagtg gtgcttacat tgtgagacag gctgctaaga gcattgtggc 960
aagtggacta gccagaaggt gcattgtgca agtgtcttat gccattggtg tgcccgagcc 1020
tttgtctgtc tttgttgaca cctatggcac cgggaagatc catgataagg agattctcaa 1080
cattgtgaag gagaactttg atttcaggcc cggtatgatc tccatcaacc ttgatctcaa 1140
gaggggtggg aataacaggt tcttgaagac tgctgcatat ggacacttcg gcagagagga 1200
ccctgacttc acatgggaag tggtcaagcc cctcaagtgg gagaaggcct aaggccattc 1260
attccactgc aatgtgctgg gagtttttta gcgttgccct tataatgtct attatccata 1320
actttccacg tcccttgctc tgtgtttttc tctcgtcgtc ctcctcctat tttgtttctc 1380
ctgcctttca tttgtaattt tttacatgat caactaaaaa atgtactctc tgttttccga 1440
ccattgtgtc tcttaatatc agtatcaaaa agaatgttcc aagttaaaaa aaaaaaaaaa 1500
aaaaaaaaaa aaaaaaaa 1518



<210> 2

<211> 2336

<212> DNA

<213> Glycine max

<400> 2

atcgatagag acatgttatt cacaaaccat aaaatgatgg ctaaaattgg tgtgattgga 60
acgatatctg tttattatga tttcagggcg caaaaatgcg agtacttaat aaaattttac 120
atttaaatta gaattttttt tatcaataaa tattaattta ttagttttat tagaaatatt 180
aattagaaaa ttttgaatcc ccgatttctc ctccttttct tcgctattca tcattttcta 240
accaaaccaa tcttatatgt tcttcaaatt agaacttgaa attattaatt ataattaaac 300
tgaaaacaat ttggtatcaa ttcatataca tgcttagtaa taaaatgcga taattaattg 360
ataaatctgc aaaagatttt acaaatatct ttcagaaaaa attaataaca aattttgtcg 420
ttttcatggt gttggtctga ggaggatttg gcactataga actctcctac ggaccattct 480
ttgcacttca actaaacgat ggtcagaatt ggtggggatt ttatattcaa gcatatccct 540
ttcaaaactt cctacttact tcgtgcgttc ggtaatcggt aacattagac tttcaaaatc 600
atttttaacc cctaaacagt aaatttgaag gacaaaaata atatttttca aatttgatag 660
actatttttt ttttgtaatt tgacgaacca aaaccagatt tatcctgaat tttaggaacc 720
acagatgtaa ctaaaccaat atttatttat tttctaaaac aaaatttcat ggcagcatgc 780
ctcagcccat gaaaaaaacc ttataaaaat atctacacat tgaccattga aaagttcgtt 840
ctcccatggg taaccagatc aaactcacat ccaaacataa catggatatc tccttaccaa 900
tcatactaat tattttgggt taaatattaa tcattatttt taagatatta attaagaaat 960
taaaagattt tttaaaaaaa tgtataaaat tatattattc atgatttttc atacatttga 1020
ttttgataat aaatatattt tttttaattt cttaaaaaat gttgcaagac acttattaga 1080
catagtcttg ttctgtttac aaaagcattc atcatttaat acattaaaaa atatttaata 1140
ctaacagtag aatcttcttg tgagtggtgt gggagtaggc aacctggcat tgaaacgaga 1200
gaaagagagt cagaaccaga agacaaataa aaagtatgca acaaacaaat caaaatcaaa 1260
gggcaaaggc tggggttggc tcaattggtt gctacattca attttcaact cagtcaacgg 1320
ttgagattca ctctgacttc cccaatctaa gccgcggatg caaacggttg aatctaaccc 1380
acaatccaat ctcgttactt aggggctttt ccgtcattaa ctcacccctg ccacccggtt 1440
tccctataaa ttggaactca atgctcccct ctaaactcgt atcgcttcag agttgagacc 1500
aagacacact cgttcatata tctctctgct cttctcttct cttctacctc tcaaggtact 1560
tttcttctcc ctctaccaaa tcctagattc cgtggttcaa tttcggatct tgcacttctg 1620
gtttgctttg ccttgctttt tcctcaactg ggtccatcta ggatccatgt gaaactctac 1680
tctttcttta atatctgcgg aatacgcgtt ggactttcag atctagtcga aatcatttca 1740
taattgcctt tctttctttt agcttatgag aaataaaatc attttttttt atttcaaaat 1800
aaaccttggg ccttgtgctg actgagatgg ggtttggtga ttacagaatt ttagcgaatt 1860
ttgtaattgt acttgtttgt ctgtagtttt gttttgtttt cttgtttctc atacattcct 1920
taggcttcaa ttttattcga gtataggtca caataggaat tcaaactttg agcaggggaa 1980
ttaatccctt ccttcaaatc cagtttgttt gtatatatgt ttaaaaaatg aaacttttgc 2040
tttaaattct attataactt tttttatggc aaaaattttt gcatgtgtct ttgctctcct 2100
gttgtaaatt tactgtttag gtactaactc taggcttgtt gtgcagtttt tgaagtataa 2160
agatggcaga gacattccta ttcacctcgg agtcagtgaa cgagggacac cctgataagc 2220
tctgcgacca aatctccgat gctgtcctcg acgcttgcct cgaacaggac ccagacagca 2280
aggttgcctg cgaaacatgc accaagacca acttggtcat ggtcttcgga gagatc 2336



<210> 3 

<211> 522 

<212> DNA 

<213> Glycine max 



<220> 

<221> unsure 

<222> (405) 

<220> 

<221> unsure 

<222> (509) 

<220> 

<221> unsure 

<222> (515) 






<400> 3 

gaccaagaca cactcgttca tatatctctc tgctcttctc ttctcttcta cctctcaagt 60 
ttttgaagta taaagatggc agagacattc ctattcacct cggagtcagt gaacgaggga 120 
caccctgata agctctgcga ccaaatctcc gatgctgtcc tcgacgcttg cctcgaacag 180 
gacccagaca gcaaggttgc ctgcgaaaca tgcaccaaga ccaacttggt catggtcttc 240 
ggagagatca ccaccaaggc caacgttgac tacgagaaga tcgtgcgtga cacctgcagg 300 
agcatcggct tcatctcaaa cgatgtggga cttgatgctg acaactgcaa ggtccttgta 360 
aacattgagc agcagagccc tgatattgcc cagggcgtgc acggncacct taccaaaaga 420 
cctgaagaaa ttggcgctgg tgaccaaggt cacatgtttg gctatgccac tgatgaaacc 480 
ccaaaattca tgccattgag tcatgttcnt gcaancaagc tc 522 

<210> 4 

<211> 32 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 4 

catgccatgg ctttatactt caaaaactgc ac 32 

<210> 5 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 5 

gctctagatc aaactcacat ccaa 24 

<210> 6 

<211> 1314 

<212> DNA 

<213> Glycine max 

<400> 6 

tctagatcaa actcacatcc aaacataaca tggatatctc cttaccaatc atactaatta 60 

ttttgggtta aatattaatc attattttta agatattaat taagaaatta aaagattttt 120 

taaaaaaatg tataaaatta tattattcat gatttttcat acatttgatt ttgataataa 180 

atatattttt tttaatttct taaaaaatgt tgcaagacac ttattagaca tagtcttgtt 240 

ctgtttacaa aagcattcat catttaatac attaaaaaat atttaatact aacagtagaa 300 

tcttcttgtg agtggtgtgg gagtaggcaa cctggcattg aaacgagaga aagagagtca 360 

gaaccagaag acaaataaaa agtatgcaac aaacaaatca aaatcaaagg gcaaaggctg 420 

gggttggctc aattggttgc tacattcaat tttcaactca gtcaacggtt gagattcact 480 

ctgacttccc caatctaagc cgcggatgca aacggttgaa tctaacccac aatccaatct 540 

cgttacttag gggcttttcc gtcattaact cacccctgcc acccggtttc cctataaatt 600 

ggaactcaat gctcccctct aaactcgtat cgcttcagag ttgagaccaa gacacactcg 660 

ttcatatatc tctctgctct tctcttctct tctacctctc aaggtacttt tcttctccct 720 

ctaccaaatc ctagattccg tggttcaatt tcggatcttg cacttctggt ttgctttgcc 780 

ttgctttttc ctcaactggg tccatctagg atccatgtga aactctactc tttctttaat 840 

atctgcggaa tacgcgttgg actttcagat ctagtcgaaa tcatttcata attgcctttc 900 

tttcttttag cttatgagaa ataaaatcat ttttttttat ttcaaaataa accttgggcc 960 

ttgtgctgac tgagatgggg tttggtgatt acagaatttt agcgaatttt gtaattgtac 1020 

ttgtttgtct gtagttttgt tttgttttct tgtttctcat acattcctta ggcttcaatt 1080 

ttattcgagt ataggtcaca ataggaattc aaactttgag caggggaatt aatcccttcc 1140

ttcaaatcca gtttgtttgt atatatgttt aaaaaatgaa acttttgctt taaattctat 1200 






tataactttt tttatggcaa aaatttttgc atgtgtcttt gctctcctgt tgtaaattta 1260 
ctgtttaggt actaactcta ggcttgttgt gcagtttttg aagtataacc atgg 1314 

<210> 7 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 7 

ttcgagtata ggtcacaata gg 22 

<210> 8 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 8 

cttcgctgag gacatggac 19 

<210> 9 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 9 

gagttgtcgc tgttgttcga c 21 

<210> 10 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 10 

aacacagcat ccgcattgcg 20 

<210> 11 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 

<400> 11 

aggagtgcag aatcagatca g 21 

<210> 12

<211> 20

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: PCR Primer

<400> 12

gctgatcgaa ccagatggag 20



<210> 13 

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR Primer 



<400> 13

ctgtacagtt aaacagtagt tct 23



<210> 14

<211> 2165

<212> DNA

<213> Glycine max

<400> 14

atcgatagag acatgttatt cacaaaccat aaaatgatgg ctaaaattgg tgtgattgga 60
acgatatctg tttattatga tttcagggcg caaaaatgcg agtacttaat aaaattttac 120
atttaaatta gaattttttt tatcaataaa tattaattta ttagttttat tagaaatatt 180
aattagaaaa ttttgaatcc ccgatttctc ctccttttct tcgctattca tcattttcta 240
accaaaccaa tcttatatgt tcttcaaatt agaacttgaa attattaatt ataattaaac 300
tgaaaacaat ttggtatcaa ttcatataca tgcttagtaa taaaatgcga taattaattg 360
ataaatctgc aaaagatttt acaaatatct ttcagaaaaa attaataaca aattttgtcg 420
ttttcatggt gttggtctga ggaggatttg gcactataga actctcctac ggaccattct 480
ttgcacttca actaaacgat ggtcagaatt ggtggggatt ttatattcaa gcatatccct 540
ttcaaaactt cctacttact tcgtgcgttc ggtaatcggt aacattagac tttcaaaatc 600
atttttaacc cctaaacagt aaatttgaag gacaaaaata atatttttca aatttgatag 660
actatttttt ttttgtaatt tgacgaacca aaaccagatt tatcctgaat tttaggaacc 720
acagatgtaa ctaaaccaat atttatttat tttctaaaac aaaatttcat ggcagcatgc 780
ctcagcccat gaaaaaaacc ttataaaaat atctacacat tgaccattga aaagttcgtt 840
ctcccatggg taaccagatc aaactcacat ccaaacataa catggatatc tccttaccaa 900
tcatactaat tattttgggt taaatattaa tcattatttt taagatatta attaagaaat 960
taaaagattt tttaaaaaaa tgtataaaat tatattattc atgatttttc atacatttga 1020
ttttgataat aaatatattt tttttaattt cttaaaaaat gttgcaagac acttattaga 1080
catagtcttg ttctgtttac aaaagcattc atcatttaat acattaaaaa atatttaata 1140
ctaacagtag aatcttcttg tgagtggtgt gggagtaggc aacctggcat tgaaacgaga 1200
gaaagagagt cagaaccaga agacaaataa aaagtatgca acaaacaaat caaaatcaaa 1260
gggcaaaggc tggggttggc tcaattggtt gctacattca attttcaact cagtcaacgg 1320
ttgagattca ctctgacttc cccaatctaa gccgcggatg caaacggttg aatctaaccc 1380
acaatccaat ctcgttactt aggggctttt ccgtcattaa ctcacccctg ccacccggtt 1440
tccctataaa ttggaactca atgctcccct ctaaactcgt atcgcttcag agttgagacc 1500
aagacacact cgttcatata tctctctgct cttctcttct cttctacctc tcaaggtact 1560
tttcttctcc ctctaccaaa tcctagattc cgtggttcaa tttcggatct tgcacttctg 1620
gtttgctttg ccttgctttt tcctcaactg ggtccatcta ggatccatgt gaaactctac 1680
tctttcttta atatctgcgg aatacgcgtt ggactttcag atctagtcga aatcatttca 1740
taattgcctt tctttctttt agcttatgag aaataaaatc attttttttt atttcaaaat 1800
aaaccttggg ccttgtgctg actgagatgg ggtttggtga ttacagaatt ttagcgaatt 1860
ttgtaattgt acttgtttgt ctgtagtttt gttttgtttt cttgtttctc atacattcct 1920
taggcttcaa ttttattcga gtataggtca caataggaat tcaaactttg agcaggggaa 1980
ttaatccctt ccttcaaatc cagtttgttt gtatatatgt ttaaaaaatg aaacttttgc 2040
tttaaattct attataactt tttttatggc aaaaattttt gcatgtgtct ttgctctcct 2100
gttgtaaatt tactgtttag gtactaactc taggcttgtt gtgcagtttt tgaagtataa 2160
agatg 2165



<210> 15

<211> 1574

<212> DNA

<213> Glycine max

<400> 15

atcgatagag acatgttatt cacaaaccat aaaatgatgg ctaaaattgg tgtgattgga 60
acgatatctg tttattatga tttcagggcg caaaaatgcg agtacttaat aaaattttac 120
atttaaatta gaattttttt tatcaataaa tattaattta ttagttttat tagaaatatt 180
aattagaaaa ttttgaatcc ccgatttctc ctccttttct tcgctattca tcattttcta 240
accaaaccaa tcttatatgt tcttcaaatt agaacttgaa attattaatt ataattaaac 300
tgaaaacaat ttggtatcaa ttcatataca tgcttagtaa taaaatgcga taattaattg 360
ataaatctgc aaaagatttt acaaatatct ttcagaaaaa attaataaca aattttgtcg 420
ttttcatggt gttggtctga ggaggatttg gcactataga actctcctac ggaccattct 480
ttgcacttca actaaacgat ggtcagaatt ggtggggatt ttatattcaa gcatatccct 540
ttcaaaactt cctacttact tcgtgcgttc ggtaatcggt aacattagac tttcaaaatc 600
atttttaacc cctaaacagt aaatttgaag gacaaaaata atatttttca aatttgatag 660
actatttttt ttttgtaatt tgacgaacca aaaccagatt tatcctgaat tttaggaacc 720
acagatgtaa ctaaaccaat atttatttat tttctaaaac aaaatttcat ggcagcatgc 780
ctcagcccat gaaaaaaacc ttataaaaat atctacacat tgaccattga aaagttcgtt 840
ctcccatggg taaccagatc aaactcacat ccaaacataa catggatatc tccttaccaa 900
tcatactaat tattttgggt taaatattaa tcattatttt taagatatta attaagaaat 960
taaaagattt tttaaaaaaa tgtataaaat tatattattc atgatttttc atacatttga 1020
ttttgataat aaatatattt tttttaattt cttaaaaaat gttgcaagac acttattaga 1080
catagtcttg ttctgtttac aaaagcattc atcatttaat acattaaaaa atatttaata 1140
ctaacagtag aatcttcttg tgagtggtgt gggagtaggc aacctggcat tgaaacgaga 1200
gaaagagagt cagaaccaga agacaaataa aaagtatgca acaaacaaat caaaatcaaa 1260
gggcaaaggc tggggttggc tcaattggtt gctacattca attttcaact cagtcaacgg 1320
ttgagattca ctctgacttc cccaatctaa gccgcggatg caaacggttg aatctaaccc 1380
acaatccaat ctcgttactt aggggctttt ccgtcattaa ctcacccctg ccacccggtt 1440
tccctataaa ttggaactca atgctcccct ctaaactcgt atcgcttcag agttgagacc 1500
aagacacact cgttcatata tctctctgct cttctcttct cttctacctc tcaagttttt 1560
gaagtataaa gatg 1574



<210> 16

<211> 719

<212> DNA

<213> Glycine max

<400> 16

agatcaaact cacatccaaa cataacatgg atatctcctt accaatcata ctaattattt 60
tgggttaaat attaatcatt atttttaaga tattaattaa gaaattaaaa gattttttaa 120
aaaaatgtat aaaattatat tattcatgat ttttcataca tttgattttg ataataaata 180
tatttttttt aatttcttaa aaaatgttgc aagacactta ttagacatag tcttgttctg 240
tttacaaaag cattcatcat ttaatacatt aaaaaatatt taatactaac agtagaatct 300
tcttgtgagt ggtgtgggag taggcaacct ggcattgaaa cgagagaaag agagtcagaa 360
ccagaagaca aataaaaagt atgcaacaaa caaatcaaaa tcaaagggca aaggctgggg 420
ttggctcaat tggttgctac attcaatttt caactcagtc aacggttgag attcactctg 480
acttccccaa tctaagccgc ggatgcaaac ggttgaatct aacccacaat ccaatctcgt 540
tacttagggg cttttccgtc attaactcac ccctgccacc cggtttccct ataaattgga 600
actcaatgct cccctctaaa ctcgtatcgc ttcagagttg agaccaagac acactcgttc 660
atatatctct ctgctcttct cttctcttct acctctcaag tttttgaagt ataaagatg 719



<210> 17 
<211> 6975 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : plasmid

<400> 17

gaatatgcat cactagtaag ctttgctcta gaggatccaa ttccaatccc acaaaaatct 60
gagcttaaca gcacagttgc tcctctcaga gcagaatcgg gtattcaaca ccctcatatc 120
aactactacg ttgtgtataa cggtccacat gccggtatat acgatgactg gggttgtaca 180
aaggcggcaa caaacggcgt tcccggagtt gcacacaaga aatttgccac tattacagag 240
gcaagagcag cagctgacgc gtacacaaca agtcagcaaa cagacaggtt gaacttcatc 300
cccaaaggag aagctcaact caagcccaag agctttgcta aggccctaac aagcccacca 360
aagcaaaaag cccactggct cacgctagga accaaaaggc ccagcagtga tccagcccca 420
aaagagatct cctttgcccc ggagattaca atggacgatt tcctctatct ttacgatcta 480
ggaaggaagt tcgaaggtga aggtgacgac actatgttca ccactgataa tgagaaggtt 540
agcctcttca atttcagaaa gaatgctgac ccacagatgg ttagagaggc ctacgcagca 600
ggtctcatca agacgatcta cccgagtaac aatctccagg agatcaaata ccttcccaag 660
aaggttaaag atgcagtcaa aagattcagg actaattgca tcaagaacac agagaaagac 720
atatttctca agatcagaag tactattcca gtatggacga ttcaaggctt gcttcataaa 780
ccaaggcaag taatagagat tggagtctct aaaaaggtag ttcctactga atctaaggcc 840
atgcatggag tctaagattc aaatcgagga tctaacagaa ctcgccgtga agactggcga 900
acagttcata cagagtcttt tacgactcaa tgacaagaag aaaatcttcg tcaacatggt 960
ggagcacgac actctggtct actccaaaaa tgtcaaagat acagtctcag aagaccaaag 1020
ggctattgag acttttcaac aaaggataat ttcgggaaac ctcctcggat tccattgccc 1080
agctatctgt cacttcatcg aaaggacagt agaaaaggaa ggtggctcct acaaatgcca 1140
tcattgcgat aaaggaaagg ctatcattca agatgcctct gccgacagtg gtcccaaaga 1200
tggaccccca cccacgagga gcatcgtgga aaaagaagac gttccaacca cgtcttcaaa 1260
gcaagtggat tgatgtgaca tctccactga cgtaagggat gacgcacaat cccactatcc 1320
ttcgcaagac ccttcctcta tataaggaag ttcatttcat ttggagagga cacgctcgag 1380
ctcatttctc tattacttca gccataacaa aagaactctt ttctcttctt attaaaccat 1440
ggtacgtcct gtagaaaccc caacccgtga aatcaaaaaa ctcgacggcc tgtgggcatt 1500
cagtctggat cgcgaaaact gtggaattga tcagcgttgg tgggaaagcg cgttacaaga 1560
aagccgggca attgctgtgc caggcagttt taacgatcag ttcgccgatg cagatattcg 1620
taattatgcg ggcaacgtct ggtatcagcg cgaagtcttt ataccgaaag gttgggcagg 1680
ccagcgtatc gtgctgcgtt tcgatgcggt cactcattac ggcaaagtgt gggtcaataa 1740
tcaggaagtg atggagcatc agggcggcta tacgccattt gaagccgatg tcacgccgta 1800
tgttattgcc gggaaaagtg tacgtatcac cgtttgtgtg aacaacgaac tgaactggca 1860
gactatcccg ccgggaatgg tgattaccga cgaaaacggc aagaaaaagc agtcttactt 1920
ccatgatttc tttaactatg ccggaatcca tcgcagcgta atgctctaca ccacgccgaa 1980
cacctgggtg gacgatatca ccgtggtgac gcatgtcgcg caagactgta accacgcgtc 2040
tgttgactgg caggtggtgg ccaatggtga tgtcagcgtt gaactgcgtg atgcggatca 2100
acaggtggtt gcaactggac aaggcactag cgggactttg caagtggtga atccgcacct 2160
ctggcaaccg ggtgaaggtt atctctatga actgtgcgtc acagccaaaa gccagacaga 2220
gtgtgatatc tacccgcttc gcgtcggcat ccggtcagtg gcagtgaagg gccaacagtt 2280
cctgattaac cacaaaccgt tctactttac tggctttggt cgtcatgaag atgcggactt 2340
acgtggcaaa ggattcgata acgtgctgat ggtgcacgac cacgcattaa tggactggat 2400
tggggccaac tcctaccgta cctcgcatta cccttacgct gaagagatgc tcgactgggc 2460
agatgaacat ggcatcgtgg tgattgatga aactgctgct gtcggcttta acctctcttt 2520
aggcattggt ttcgaagcgg gcaacaagcc gaaagaactg tacagcgaag aggcagtcaa 2580
cggggaaact cagcaagcgc acttacaggc gattaaagag ctgatagcgc gtgacaaaaa 2640
ccacccaagc gtggtgatgt ggagtattgc caacgaaccg gatacccgtc cgcaagtgca 2700
cgggaatatt tcgccactgg cggaagcaac gcgtaaactc gacccgacgc gtccgatcac 2760
ctgcgtcaat gtaatgttct gcgacgctca caccgatacc atcagcgatc tctttgatgt 2820
gctgtgcctg aaccgttatt acggatggta tgtccaaagc ggcgatttgg aaacggcaga 2880
gaaggtactg gaaaaagaac ttctggcctg gcaggagaaa ctgcatcagc cgattatcat 2940
caccgaatac ggcgtggata cgttagccgg gctgcactca atgtacaccg acatgtggag 3000
tgaagagtat cagtgtgcat ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc 3060
cgtcgtcggt gaacaggtat ggaatttcgc cgattttgcg acctcgcaag gcatattgcg 3120
cgttggcggt aacaagaaag ggatcttcac tcgcgaccgc aaaccgaagt cggcggcttt 3180
tctgctgcaa aaacgctgga ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa 3240
acaatgaatc aacaactctc ctggcgcacc atcgtcggct acagcctcgg tggggaattc 3300
cccgggggta cctaatagtg agatccaaca cttacgtttg caacgtccaa gagcaaatag 3360
accacgnacg ccggaaggtt gccgcagcgt gtggattgcg tctcaattct ctcttgcagg 3420
aatgcaatga tgaatatgat actgactatg aaactttgag ggaatactgc ctagcaccgt 3480
cacctcataa cgtgcatcat gcatgccctg acaacatgga acatcgctat ttttctgaag 3540
aattatgctc gttggaggat gtcgcggcaa ttgcagctat tgccaacatc gaactacccc 3600
tcacgcatgc attcatcaat attattcatg cggggaaagg caagattaat ccaactggca 3660
aatcatccag cgtgattggt aacttcagtt ccagcgactt gattcgtttt ggtgctaccc 3720
acgttttcaa taaggacgag atggtggagt aaagaaggag tgcgtcgaag cagatcgttc 3780
aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg cgatgattat 3840
catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat gcatgacgtt 3900
atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat acgcgataga 3960
aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat ctatgttact 4020
agatcgatca aacttcggta ctgtgtaatg acgatgagca atcgagaggc tgactaacaa 4080
aaggtacatc ggtcgacgag ctccctatag tgagtcgtat tagaggccga cttggccaaa 4140
ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 4200
caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 4260
cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 4320
gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 4380
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 4440
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 4500
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 4560
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 4620
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 4680
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 4740
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 4800
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 4860
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 4920
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 4980
cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 5040
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 5100
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 5160
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 5220
attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 5280
ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 5340
tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 5400
aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 5460
acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 5520
aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 5580
agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 5640
ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 5700
agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 5760
tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 5820
tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 5880
attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 5940
taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 6000
aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 6060
caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 6120
gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 6180
cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 6240
tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 6300
acctgacgcg ccctgtagcg gcgcattaag cgcggcgggt gtggtggtta cgcgcagcgt 6360
gaccgctaca cttgccagcg ccctagcgcc cgctcctttc gctttcttcc cttcctttct 6420
cgccacgttc gccggctttc cccgtcaagc tctaaatcgg ggcatccctt tagggttccg 6480
atttagtgct ttacggcacc tcgaccccaa aaaacttgat tagggtgatg gttcacgtag 6540
tgggccatcg ccctgataga cggtttttcg ccctttgacg ttggagtcca cgttctttaa 6600
tagtggactc ttgttccaaa ctggaacaac actcaaccct atctcggtct attcttttga 6660
tttataaggg attttgccga tttcggccta ttggttaaaa aatgagctga tttaacaaaa 6720
atttaacgcg aattttaaca aaatattaac aaaatattaa cgtttacaat ttcccattcg 6780
ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc gggcctcttc gctattacgc 6840
cagctggcga aagggggatg tgctgcaagg cgattaagtt gggtaacgcc agggttttcc 6900 
cagtcacgac gttgtaaaac gacggccagt gccaagctga cttggtcagc ggccgcagat 6960 
ttaggtgaca ctata 6975 

<210> 18 

<211> 3985 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : chimeric gene 



<400> 18 

aagctttgct 

atactaatta 

aaagattttt 

ttgataataa 

tagtcttgtt 

aacagtagaa 

aagagagtca 

gcaaaggctg 

gagattcact 

aatccaatct 

cctataaatt 

gacacactcg 

tcttctccct 

ttgctttgcc 

tttctttaat 

attgcctttc 

aaccttgggc 

tgtaattgta 

aggcttcaat 

taatcccttc 

ttaaattcta 

tgtaaattta 

atggtacgtc 

ttcagtctgg 

gaaagccggg 

cgtaattatg 

ggccagcgta 

aatcaggaag 

tatgttattg 

cagactatcc 

ttccatgatt 

aacacctggg 

tctgttgact 

caacaggtgg 

ctctggcaac 

gagtgtgata 

ttcctgatta 

ttacgtggca 

attggggcca 

gcagatgaac 

ttaggcattg 

aacggggaaa 

aaccacccaa 

cacgggaata 

acctgcgtca 

gtgctgtgcc 



ctagatcaaa 
ttttgggtta 
taaaaaaatg 
atatattttt 
ctgtttacaa 
tcttcttgtg 
gaaccagaag 
gggttggctc 
ctgacttccc 
cgttacttag 
ggaactcaat 
ttcatatatc 
ctaccaaatc 
ttgctttttc 
atctgcggaa 
tttcttttag 
cttgtgctga 
cttgtttgtc 
tttattcgag 
cttcaaatcc 
ttataacttt 
ctgtttaggt 
ctgtagaaac 
atcgcgaaaa 
caattgctgt 
cgggcaacgt 
tcgtgctgcg 
tgatggagca 
ccgggaaaag 
cgccgggaat 
tctttaacta 
tggacgatat 
ggcaggtggt 
ttgcaactgg 
cgggtgaagg 
tctacccgct 
accacaaacc 
aaggattcga 
actcctaccg 
atggcatcgt 
gtttcgaagc 
ctcagcaagc 
gcgtggtgat 
tttcgccact 
atgtaatgtt 
tgaaccgtta 



ctcacatcca 
aatattaatc 
tataaaatta 
tttaatttct 
aagcattcat 
agtggtgtgg 
acaaataaaa 
aattggttgc 
caatctaagc 
gggcttttcc 
gctcccctct 
tctctgctct 
ctagattccg 
ctcaactggg 
tacgcgttgg 
cttatgagaa 
ctgagatggg 
tgtagttttg 
tataggtcac 
agtttgtttg 
ttttatggct 
actaactcta 
cccaacccgt 
ctgtggaatt 
gccaggcagt 
ctggtatcag 
tttcgatgcg 
tcagggcggc 
tgtacgtatc 
ggtgattacc 
tgccggaatc 
caccgtggtg 
ggccaatggt 
acaaggcact 
ttatctctat 
tcgcgtcggc 
gttctacttt 
taacgtgctg 
tacctcgcat 
ggtgattgat 
gggcaacaag 
gcacttacag 
gtggagtatt 
ggcggaagca 
ctgcgacgct 
ttacggatgg 



aacataacat 
attattttta 
tattattcat 
taaaaaatgt 
catttaatac 
gagtaggcaa 
agtatgcaac 
tacattcaat 
cgcggatgca 
gtcattaact 
aaactcgtat 
tctcttctct 
tggttcaatt 
tccatctagg 
actttcagat 
ataaaatcac 
gtttggtgat 
ttttgttttc 
aataggaatt 
tatatatgtt 
gaaatttttg 
ggcttgttgt 
gaaatcaaaa 
gatcagcgtt 
tttaacgatc 
cgcgaagtct 
gtcactcatt 
tatacgccat 
accgtttgtg 
gacgaaaacg 
catcgcagcg 
acgcatgtcg 
gatgtcagcg 
agcgggactt 
gaactgtgcg 
atccggtcag 
actggctttg 
atggtgcacg 
tacccttacg 
gaaactgctg 
ccgaaagaac 
gcgattaaag 
gccaacgaac 
acgcgtaaac 
cacaccgata 
tatgtccaaa 



ggatatcttc 
agatattaat 
gatttttcat 
tgcaagacac 
attaaaaaat 
cctggcattg 
aaacaaatca 
tttcaactca 
aacggttgaa 
cacccctgcc 
cgcttcagag 
tctacctctc 
tcggatcttg 
atccatgtga 
ctagtcgaaa 
ttttttttta 
tacagaattt 
ttgtttctca 
caaactttga 
taaaaaatga 
catgtgtctt 
gcagtttttg 
aactcgacgg 
ggtgggaaag 
agttcgccga 
ttataccgaa 
acggcaaagt 
ttgaagccga 
tgaacaacga 
gcaagaaaaa 
taatgctcta 
cgcaagactg 
ttgaactgcg 
tgcaagtggt 
tcacagccaa 
tggcagtgaa 
gtcgtcatga 
accacgcatt 
ctgaagagat 
ctgtcggctt 
tgtacagcga 
agctgatagc 
cggatacccg 
tcgacccgac 
ccatcagcga 
gcggcgattt 



cttaccaatc 
taagaaatta 
acatttgatt 
ttattagaca 
atttaatact 
aaacgagaga 
aaatcaaagg 
gtcaacggtt 
tctaacccac 
acccggtttc 
ttgagaccaa 
aaggtacttt 
cacttctggt 
aactctactc 
tcatttcata 
tttcaaaata 
tagcgaattt 
tacattcctt 
gcaggggaat 
aacttttgct 
tgctctctgt 
aagtataacc 
cctgtgggca 
cgcgttacaa 
tgcagatatt 
aggttgggca 
gtgggtcaat 
tgtcacgccg 
actgaactgg 
gcagtcttac 
caccacgccg 
taaccacgcg 
tgatgcggat 
gaatccgcac 
aagccagaca 
gggccaacag 
agatgcggac 
aatggactgg 
gctcgactgg 
taacctctct 
agaggcagtc 
gcgtgacaaa 
tccgcaagtg 
gcgtccgatc 
tctctttgat 
ggaaacggca 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
gagaaggtac
atcaccgaat 
agtgaagagt 
gccgtcgtcg 
cgcgttggcg 
tttctgctgc 
aaacaatgaa 
tccccggggg 
agaccacgna 
ggaatgcaat 
gtcacctcat 
agaattatgc 
cctcacgcat 
caaatcatcc 
ccacgttttc 
tcaaacattt 
atcatataat 
ttatttatga 
gaaaacaaaa 
ctagatcgat 
aaaaggtaca 



tggaaaaaga 
acggcgtgga 
atcagtgtgc 
gtgaacaggt 
gtaacaagaa 
aaaaacgctg 
tcaacaactc 
tacctaatag 
cgccggaagg 
gatgaatatg 
aacgtgcatc 
tcgttggagg 
gcattcatca 
agcgtgattg 
aataaggacg 
ggcaataaag 
ttctgttgaa 
gatgggtttt 
tatagcgcgc 
caaacttcgg 
tcggtcgacg 



acttctggcc 
tacgttagcc 
atggctggat 
atggaatttc 
agggatcttc 
gactggcatg 
tcctggcgca 
tgagatccaa 
ttgccgcagc 
atactgacta 
atgcatgccc 
atgtcgcggc 
atattattca 
gtaacttcag 
agatggtgga 
tttcttaaga 
ttacgttaag 
tatgattaga 
aaactaggat 
tactgtgtaa 
agctc 



tggcaggaga 
gggctgcact 
atgtatcacc 
gccgattttg 
actcgcgacc 
aacttcggtg 
ccatcgtcgg 
cacttacgtt 
gtgtggattg 
tgaaactttg 
tgacaacatg 
aattgcagct 
tgcggggaaa 
ttccagcgac 
gtaaagaagg 
ttgaatcctg 
catgtaataa 
gtcccgcaat 
aaattatcgc 
tgacgatgag 



aactgcatca 
caatgtacac 
gcgtctttga 
cgacctcgca 
gcaaaccgaa 
aaaaaccgca 
ctacagcctc 
tgcaacgtcc 
cgtctcaatt 
agggaatact 
gaacatcgct 
attgccaaca 
ggcaagatta 
ttgattcgtt 
agtgcgtcga 
ttgccggtct 
ttaacatgta 
tatacattta 
gcgcggtgtc 
caatcgagag 



gccgattatc 
cgacatgtgg 
tcgcgtcagc 
aggcatattg 
gtcggcggct 
gcagggaggc 
ggtggggaat 
aagagcaaat 
ctctcttgca 
gcctagcacc 
atttttctga 
tcgaactacc 
atccaactgg 
ttggtgctac 
agcagatcgt 
tgcgatgatt 
atgcatgacg 
atacgcgata 
atctatgtta 
gctgactaac 



2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
3985 



<210> 19 

<211> 3684 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : chimeric gene 



<400> 19 

aagctttgct 

atactaatta 

aaagattttt 

ttgataataa 

tagtcttgtt 

aacagtagaa 

aagagagtca 

gcaaaggctg 

gagattcact 

aatccaatct 

cctataaatt 

gacacactcg 

tcttctccct 

ttgctttgcc 

tttctttaat 

attgcctttc 

aaccttgggc 

tgtaattgta 

aggcttcaat 

taatcccttc 

ttaaattcta 

tgtaaattta 

atggccactt 

acccatttcg 

gttggggtgc 

gtggtgacgg 

aagctctcaa 

ctgaaaggct 

ctcgatgatg 



ctagatcaaa 
ttttgggtta 
taaaaaaatg 
atatattttt 
ctgtttacaa 
tcttcttgtg 
gaaccagaag 
gggttggctc 
ctgacttccc 
cgttacttag 
ggaactcaat 
ttcatatatc 
ctaccaaatc 
ttgctttttc 
atctgcggaa 
tttcttttag 
cttgtgctga 
cttgtttgtc 
tttattcgag 
cttcaaatcc 
ttataacttt 
ctgtttaggt 
tcttcgccca 
cccctgcccc 
gagtttcgaa 
attttgagag 
ggattgacct 
tcatgagaga 
ggtcggtcgt 



ctcacatcca 
aatattaatc 
tataaaatta 
tttaatttct 
aagcattcat 
agtggtgtgg 
acaaataaaa 
aattggttgc 
caatctaagc 
gggcttttcc 
gctcccctct 
tctctgctct 
ctagattccg 
ctcaactggg 
tacgcgttgg 
cttatgagaa 
ctgagatggg 
tgtagttttg 
tataggtcac 
agtttgtttg 
ttttatggct 
actaactcta 
aacctccttc 
gaaagtcaac 
cgcgctgatc 
ggatttgaag 
tgagtgggtc 
agccgagttc 
gaacatgtca 



aacataacat 
attattttta 
tattattcat 
taaaaaatgt 
catttaatac 
gagtaggcaa 
agtatgcaac 
tacattcaat 
cgcggatgca 
gtcattaact 
aaactcgtat 
tctcttctct 
tggttcaatt 
tccatctagg 
actttcagat 
ataaaatcac 
gtttggtgat 
ttttgttttc 
aataggaatt 
tatatatgtt 
gaaatttttg 
ggcttgttgt 
ccctcccact 
gtctttgtga 
gaaccagatg 
aagggtgagg 
catgtcctca 
ctccaaacgc 
gtgcccatcg 




ggatatcttc 
agatattaat 
gatttttcat 
tgcaagacac 
attaaaaaat 
cctggcattg 
aaacaaatca 
tttcaactca 
aacggttgaa 
cacccctgcc 
cgcttcagag 
tctacctctc 
tcggatcttg 
atccatgtga 
ctagtcgaaa 
ttttttttta 
tacagaattt 
ttgtttctca 
caaactttga 
taaaaaatga 
catgtgtctt 
gcagtttttg 
ctctctccaa 
acttcagggc 
gagggaagct 
ctctttcgtt 
gcgaaggatg 
ttcatttcaa 
tgctggctat 



cttaccaatc 
taagaaatta 
acatttgatt 
ttattagaca 
atttaatact 
aaacgagaga 
aaatcaaagg 
gtcaacggtt 
tctaacccac 
acccggtttc 
ttgagaccaa 
aaggtacttt 
cacttctggt 
aactctactc 
tcatttcata 
tttcaaaata 
tagcgaattt 
tacattcctt 
gcaggggaat 
aacttttgct 
tgctctctgt 
aagtataacc 
aaccttcgat 
gaggaggcac 
cgtggagctt 
gccgaggatc 
ggccacaccc 
ctcgctccga 
tgatgatgcg 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



cagaagcatc 
gttgcaattc 
acttggggaa 
aattggttga 
gatcattttc 
gctgtgtttg 
gacacccgaa 
cttggaggct 
aaggtacttg 
cccatgcact 
ggggctaact 
agagatctgt 
cgtctaaaca 
ttctttgacc 
ctggcaagga 
ctggttgatt 
ccagcttaat 
cttatgtatt 
ataataaatt 
ggggcccggt 
gaccacgnac 
gaatgcaatg 
tcacctcata 
gaattatgct 
ctcacgcatg 
aaatcatcca 
cacgttttca 
caaacatttg 
tcatataatt 
tatttatgag 
aaaacaaaat 
tagatcgatc 
aaaggtacat 



ggatcgggga 
tcaataatat 
ccattgcccc 
ttgggggtga 
gtctatctcc 
ccttccagct 
agcgccttct 
acaccaaagc 
aggatggtgt 
atgctggacc 
tctatatcgt 
atgatgctga 
ttcttccttt 
cttcaaggcc 
acaaagaaag 
actatgatag 
cttgtatcat 
ctcctgctaa 
aaaactttgg 
acctaatagt 
gccggaaggt 
atgaatatga 
acgtgcatca 
cgttggagga 
cattcatcaa 
gcgtgattgg 
ataaggacga 
gcaataaagt 
tctgttgaat 
atgggttttt 
atagcgcgca 
aaacttcggt 
cggtcgacga 



taacaaaaag 
tgagatttat 
tggcctacct 
cctagaggtc 
ggcacaactc 
ccggaatcct 
tgagatgggc 
tgatgatgtc 
tcttgatcca 
cacggaggtg 
tggtcgtgac 
ccatggaaag 
cagggttgct 
tcaggacttc 
tcctcctgat 
cttagtactc 
atcataatgt 
gacatacttc 
ggaggcaaaa 
gagatccaac 
tgccgcagcg 
tactgactat 
tgcatgccct 
tgtcgcggca 
tattattcat 
taacttcagt 
gatggtggag 
ttcttaagat 
tacgttaagc 
atgattagag 
aactaggata 
actgtgtaat 
gctc 



gttgctcttt 
aagcatccta 
tatgttgaac 
attgaaccaa 
cgtgcagagt 
gttcacaatg 
tataagaatc 
ccacttgatt 
gagacaactg 
cagtggcatg 
cccgcaggca 
aaagtattga 
gcatatgaca 
ctgttcatat 
ggatttatgt 
tcaagcaacg 
atatatctca 
acgaggatcc 
aaaaaaaaaa 
acttacgttt 
tgtggattgc 
gaaactttga 
gacaacatgg 
attgcagcta 
gcggggaaag 
tccagcgact 
taaagaagga 
tgaatcctgt 
atgtaataat 
tcccgcaatt 
aattatcgcg 
gacgatgagc 



ttgattccaa 
aagaagaaag 
aaactataac 
ttcagtacaa 
tcacaaggcg 
gccatgcttt 
ctgtcctctt 
ggcgaatgaa 
tggtatccat 
caaaggctag 
tgagccatcc 
gcatggcacc 
agactcaggg 
caggcacaaa 
gccctggtgg 
gcaaagtgca 
tgattgggag 
tctggcccaa 
aaaaaaaaaa 
gcaacgtcca 
gtctcaattc 
gggaatactg 
aacatcgcta 
ttgccaacat 
gcaagattaa 
tgattcgttt 
gtgcgtcgaa 
tgccggtctt 
taacatgtaa 
atacatttaa 
cgcggtgtca 
aatcgagagg 



gggagacccc 
aatagcccga 
caatgctgga 
tgatggactt 
caatgcggat 
gctaatgact 
gcttcatcca 
gcaacatgag 
attcccatct 
gatcaatgca 
agttgagaaa 
gggactagag 
taaaatggca 
gatgcgcaca 
atggaaggtg 
ggaagctgtt 
aaaccttaag 
tctaataata 
aactcgaggg 
agagcaaata 
tctcttgcag 
cctagcaccg 
tttttctgaa 
cgaactaccc 
tccaactggc 
tggtgctacc 
gcagatcgtt 
gcgatgatta 
tgcatgacgt 
tacgcgatag 
tctatgttac 
ctgactaaca 



1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3684



<210> 20 

<211> 3963 

<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence : chimeric gene



<400> 20 

aagctttgct 

atactaatta 

aaagattttt 

ttgataataa 

tagtcttgtt 

aacagtagaa 

aagagagtca 

gcaaaggctg 

gagattcact 

aatccaatct 

cctataaatt 

gacacactcg 

tcttctccct 

ttgctttgcc 

tttctttaat 

attgcctttc 

aaccttgggc 



ctagatcaaa 
ttttgggtta 
taaaaaaatg 
atatattttt 
ctgtttacaa 
tcttcttgtg 
gaaccagaag 
gggttggctc 
ctgacttccc 
cgttacttag 
ggaactcaat 
ttcatatatc 
ctaccaaatc 
ttgctttttc 
atctgcggaa 
tttcttttag 
cttgtgctga 



ctcacatcca 
aatattaatc 
tataaaatta 
tttaatttct 
aagcattcat 
agtggtgtgg 
acaaataaaa 
aattggttgc 
caatctaagc 
gggcttttcc 
gctcccctct 
tctctgctct 
ctagattccg 
ctcaactggg 
tacgcgttgg 
cttatgagaa 
ctgagatggg 



aacataacat 
attattttta 
tattattcat 
taaaaaatgt 
catttaatac 
gagtaggcaa 
agtatgcaac 
tacattcaat 
cgcggatgca 
gtcattaact 
aaactcgtat 
tctcttctct 
tggttcaatt 
tccatctagg 
actttcagat 
ataaaatcac 
gtttggtgat 



ggatatcttc 
agatattaat 
gatttttcat 
tgcaagacac 
attaaaaaat 
cctggcattg 
aaacaaatca 
tttcaactca 
aacggttgaa 
cacccctgcc 
cgcttcagag 
tctacctctc 
tcggatcttg 
atccatgtga 
ctagtcgaaa 
ttttttttta 
tacagaattt 



cttaccaatc 
taagaaatta 
acatttgatt 
ttattagaca 
atttaatact 
aaacgagaga 
aaatcaaagg 
gtcaacggtt 
tctaacccac 
acccggtttc 
ttgagaccaa 
aaggtacttt 
cacttctggt 
aactctactc 
tcatttcata 
tttcaaaata 
tagcgaattt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 






tgtaattgta 
aggcttcaat 
taatcccttc 
ttaaattcta 
tgtaaattta 
atggccgttt 
tcgccccccc 
tccggcgcgg 
cgccagctaa 
gccgcttcgt 
gcggccaccg 
gacgtcgttg 
tattcttcat 
aggggtattg 
aagaaaaccg 
cgctatggaa 
gaatcaactg 
gttccagctg 
attgagactt 
ggagccttgg 
accaatccat 
gggactttgc 
cttggcgctg 
cttggtggtt 
gttttgggtg 
ctgcatctcc 
gcacatccca 
cttgccaaga 
ctacatacca 
ggtggctgtg 
tcagaaaggg 
gattttgagg 
ctgattcacc 
gattcaattg 
ctaatttgga 
aaaaaaaaaa 
agatccaaca 
gccgcagcgt 
actgactatg 
gcatgccctg 
gtcgcggcaa 
attattcatg 
aacttcagtt 
atggtggagt 
tcttaagatt 
acgttaagca 
tgattagagt 
actaggataa 
ctgtgtaatg 
ctc



cttgtttgtc 
tttattcgag 
cttcaaatcc 
ttataacttt 
ctgtttaggt 
cgagctcgca 
cgccgtcctt 
cgtttcacgg 
gcaccaaggc 
ggtcgaacaa 
ccacggacgc 
tctccgccgc 
ttttgaaatc 
agactgatgg 
ctgatctcat 
acccaacgac 
tgataatggc 
gtggacatct 
ttcttccaaa 
aatctgcatt
tcctgagatg 
tctgtattga 
atctgattct 
gcataagtgg 
gtacacttaa 
gtgtacagca 
aggtgaagcg 
ggcagatgac 
caataaaatt 
agagcattgt 
ccaagtacaa 
atttgaaggc 
caagtttttt 
aattttgatc 
tgttatgaaa 
aaaaaaaaaa 
cttacgtttg 
gtggattgcg 
aaactttgag 
acaacatgga 
ttgcagctat
cggggaaagg
ccagcgactt 
aaagaaggag 
gaatcctgtt 
tgtaataatt 
cccgcaatta 
attatcgcgc
acgatgagca 



tgtagttttg 
tataggtcac 
agtttgtttg 
ttttatggct 
actaactcta 
catgcgtttc
cgacaacctc 
catctcctcc 
gcgccgcaac 
cagcgacaac 
cgctacggtg 
ggcagacgag 
cgatgcaagc
aattaccacc 
tgatttcaag 
ggtggttctg 
gtctgggatg 
tgtgaccact 
gatggggatc 
ggagcagcac 
tgttgatatt 
tggtacattt 
gcactcctta 
ttcaattaag 
cccgaatgct 
gcagaattca 
ggtctactat 
tggtttcggt 
tattgattca 
ggatcaacct 
gatttatgac 
tgatgtcctg 
tcttttataa 
gtctaatgtt 
cttgtgatga 
aaaaaaaaaa 
caacgtccaa 
tctcaattct 
ggaatactgc 
acatcgctat
tgccaacatc 
caagattaat 
gattcgtttt 
tgcgtcgaag 
gccggtcttg
aacatgtaat 
tacatttaat 
gcggtgtcat 
atcgagaggc 



ttttgttttc 
aataggaatt 
tatatatgtt 
gaaatttttg 
ggcttgttgt 
acctttgagt 
cgccgccgaa 
ctcatcctcc 
tgcagcaaca 
tctccggccg 
cctctccccg 
aacggggctg 
aaaacgattc 
cctgtggtta 
gagaatcgtc
gaggagaaga 
tgtgctagcg 
acggattgtt 
acgaccactg 
aatgtgtctc 
aagctggttt 
gcaactccat 
acaaaataca 
gtggtttcgc 
gcatacctat 
acaggaatga 
ccaggcttgc 
ggtgttgtca 
ttgaaaatcc 
gctattttgt 
aacctggttc 
caagctctgg 
ttgtgctatt 
ctgttggaat 
attgttgaaa 
actcgagggg 
gagcaaatag 
ctcttgcagg
ctagcaccgt 
ttttctgaag 
gaactacccc 
ccaactggca 
ggtgctaccc 
cagatcgttc
cgatgattat 
gcatgacgtt
acgcgataga
ctatgttact 
tgactaacaa 



ttgtttctca 
caaactttga 
taaaaaatga 
catgtgtctt 
gcagtttttg 
gccgctccga 
acttccgctc 
gcttccctcc 
tcggcgtcgc 
ccggggctcc 
tcgtcgtcgc 
tacagttaaa 
atgccgctga 
acacttctgc 
aagtgagtta 
taagtgcatt 
tagtcctgtt 
ataggaagac 
taattgatcc 
tattcttcac 
cagagctttg 
tgaaccagaa 
tgggtggaca 
aaattcggac
tcatcagagg 
ggatggccaa 
cgagtcaccc 
gttttgagat 
catatattgc 
cttactggga 
gcttcagctt 
aagctatata 
tgtttgttat 
tgtgttaaga 
ttgaaacctc 
gggcccggta 
accacgnacg 
aatgcaatga 
cacctcataa 
aattatgctc
tcacgcatgc 
aatcatccag 
acgttttcaa 
aaacatttgg 
catataattt 
atttatgaga 
aaacaaaata 
agatcgatca 
aaggtacatc 



tacattcctt 
gcaggggaat 
aacttttgct
tgctctctgt 
aagtataacc 
tcccgatttc 
ctccgcagga 
caacttccag 
gcaaatcgtc
ggcgccgccc 
cgccaacgag 
cagtagttct 
aagactgggt 
ctactttttt 
tgaatacggg 
ggagggggcc
tatggcactg 
tagaatattc 
agcagatgtt 
tgagtctcct 
ccacaagaag 
ggcccttgcc 
tcatgatgtc 
tttgcaccat
catgaaaacg 
acttttagag 
tgaacatgag 
tgatggagat 
ggcctcgttt 
tcttcctcag 
tggagttgaa 
gacagttttc 
cacatctggc 
tgaatggtct 
tatttgatga 
cctaatagtg 
ccggaaggtt 
tgaatatgat 
cgtgcatcat 
gttggaggat 
attcatcaat 
cgtgattggt 
taaggacgag 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa
aacttcggta
ggtcgacgag 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
3963 